Posted to dev@brooklyn.apache.org by Alex Heneveld <al...@cloudsoft.io> on 2022/09/08 08:35:00 UTC

Declarative Workflow update & shorthand/DSL

Hi team,

An initial PR with a few types and the ability to define an effector is
available [1].

This is enough for the next pieces of work to be parallelized, e.g. new
step types added.  The proposal has been updated with a work plan / list of
tasks [2].  If you'd like to volunteer for any of the upcoming tasks, let me
know.

Finally I've been thinking about the "shorthand syntax" and how to bring us
closer to Peter's proposal of a DSL.  The original proposal allowed, instead
of a map, e.g.

step_sleep:
  type: sleep
  duration: 5s

or

step_update_service_up:
  type: set-sensor
  sensor:
    name: service.isUp
    type: boolean
  value: true

being able to use a shorthand _map_ with a single key, that key being the
type and its value interpreted by that type; so in the OLD SHORTHAND
PROPOSAL the above could be written:

step_sleep:
  sleep: 5s

step_update_service_up:
  set-sensor: service.isUp = true

Having played with syntaxes a bit, I wonder if we should instead say the
shorthand DSL kicks in when the step _body_ is a string (instead of a
single-key map), with the first word of the string being the type and the
remainder interpreted by that type, and allow it to be a bit more
ambitious.

Concretely this NEW SHORTHAND PROPOSAL would look something like:

step_sleep: sleep 5s
step_update_service_up: set-sensor service.isUp = true
# also supporting a type, ie `set-sensor [TYPE] NAME = VALUE`, eg
step_update_service_up: set-sensor boolean service.isUp = true

You would still need the full map syntax whenever defining flow logic -- eg
condition, next, retry, or timeout -- or any property not supported by the
shorthand syntax.  But for the (majority?) simple cases the expression
would be very concise.  In most cases I think it would feel like a DSL but
has the virtue of a very clear translation to the actual workflow model and
the underlying (YAML) model needed for resumption and UI.
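
For instance, a step needing the full map syntax for flow logic might look
like the following (a sketch only; `condition` and `next` are keys named
above, but the exact condition syntax and the self-referencing loop shown
here are assumptions):

step_wait_for_up:
  type: sleep
  duration: 5s
  condition:
    target: ${entity.sensor.service.isUp}
    equals: false
  next: step_wait_for_up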

As a final example, the example used at the start of the proposal
(simplified a little -- removing on-error retry and env map as those
wouldn't be supported by shorthand):

brooklyn.initializers:
- type: workflow-effector
  name: run-spark-on-gcp
  steps:
    1:
      type: container
      image: my/google-cloud
      command: gcloud dataproc jobs submit spark --BUCKET=gs://$brooklyn:config("bucket")
    2:
      type: set-sensor
      sensor: spark-output
      value: ${1.stdout}

Could be written in this shorthand as follows:

 steps:
   1: container my/google-cloud command "gcloud dataproc jobs submit spark --BUCKET=gs://${entity.config.bucket}"
   2: set-sensor spark-output ${1.stdout}

Thoughts?

Best
Alex


[1] https://github.com/apache/brooklyn-server/pull/1358
[2]
https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.gbadaqa2yql6


On Wed, 7 Sept 2022 at 09:58, Alex Heneveld <al...@cloudsoft.io> wrote:

> Hi Peter,
>
> Yes - thanks for the extra details.  I did take your suggestion to be a
> procedural DSL not YAML, per the illustration at [1] (second code block).
> Probably where I was being confusing was in saying that, unlike DSLs which
> just run (where execution can be delegated to e.g. java/groovy/ruby), here
> we need to understand and display, store and resume the workflow progress.
> So I think it needs to be compiled to some representation that is well
> described and that new Apache Brooklyn code can reason about, both in the
> UI (JS) and backend (Java).  Parsing a DSL is much harder than using YAML
> for this "reasonable" representation (as in we can reason _about_ it :) ),
> because we already have good backend processing, persistence,
> serialization; and frontend processing and visualization support for
> YAML-based models.  So I think we almost definitely want a well-described
> declarative YAML model of the workflow.
>
> We might *also* want a Workflow DSL because I agree with you a DSL would
> be nicer for a user to write (if writing by hand; although if composing
> visually a drag-and-drop to YAML is probably easier).  However it should
> probably get "compiled" into a Workflow YAML.  So I'm suggesting we do the
> workflow YAML at this stage, and a DSL that compiles into that YAML can be
> designed later.  (Designing a good DSL and parser and reason-about-able
> representation is a big task, so being able to separate it feels good too!)
>
> Best
> Alex
>
> [1]
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.75wm48pjvx0h
>
>
> On Fri, 2 Sept 2022 at 20:17, Geoff Macartney <ge...@gmail.com>
> wrote:
>
>> Hi Peter,
>>
>> Thanks for such a detailed writeup of how you see this working. I fear
>> I've too little experience with this sort of thing to be able to say
>> anything very useful about it. My thought on the matter would be,
>> let's get started with the yaml based approach and see how it goes. I
>> think that experience would then give us a much better feel for what a
>> really nice and usable DSL for workflows would look like (probably to
>> address all the pain points of the yaml approach! :-)   The outline
>> above will then be a good starting point, I'm sure.
>>
>> Cheers
>> Geoff
>>
>> On Thu, 1 Sept 2022 at 21:26, Peter Abramowitsch
>> <pa...@gmail.com> wrote:
>> >
>> > Hi All
>> > I just wanted to clarify something in my comment the other day about
>> DSLs
>> > since I see that the acronym was also used in Alex's original document.
>> > Unless I misunderstood, Alex was proposing to create a DSL for Brooklyn
>> > using yaml as syntax and writing a code layer to translate between that
>> > syntax and underlying APIs which are presumably all in Java.
>> >
>> > What I was suggesting was a DSL written directly in  Java (I guess)
>> whose
>> > syntax would be that language, but whose grammar would be keywords that
>> > were also Java functions.  Some of these functions would be pre-defined
>> in
>> > the DSL, while others could be  defined by the user and could use other
>> > functions of the DSL.  The result would be turned into a JAR file (or
>> > equivalent in another platform).  But during the compile phase, it
>> would be
>> > checked for errors, and it could be debugged line by line either
>> invoking
>> > live functionality or using a library of mock versions of the Brooklyn
>> API.
>> >
>> > In this 'native' DSL one could provide different types of workflow
>> > constructs as functions (in the BaseClass), taking function names as
>> method
>> > pointers, or using Lambdas.  It would be a lot easier in Ruby or Python
>> >
>> > // linear
>> > brooklynRun(NamedTaskMethod, NamedTaskMethod)
>> >
>> > // chained
>> > TaskMethodA().TaskMethodB()
>> >
>> > // asynchronous
>> > brooklynJoin(NamedTaskMethod, NamedTaskMethod,...)
>> >
>> > // conditional
>> > brooklynRunIf(NamedTaskMethod, NamedConditionMethod,...)
>> >
>> > // iterative
>> > brooklynRunWhile(NamedTaskMethod, NamedConditionMethod,...)
>> > brooklynRunUntil(NamedTaskMethod, NamedConditionMethod,...)
>> >
>> > // there could even be a utility to implement legacy syntax (this of
>> course
>> > would require the extra code layer I was trying to avoid)
>> > runYaml(Path)
>> >
>> > A basic class structure might be
>> >
>> > // where BrooklynRecipeBase implements the utility functions including,
>> > among others  Join, Run, If, While, Until mentioned above
>> > // and the BrooklynWorkflowInterface would dictate the functional
>> > requirements for the mandatory aspects of the Recipe.
>> > class MyRecipe extends BrooklynRecipeBase implements
>> > BrooklynWorkflowInterface
>> > {
>> > Initialize()
>> > createContext()   - spin up resources
>> > workflow() - the main launch sequence using aspects of the DSL
>> > monitoring() - an asynchronous workflow used to manage sensor output or
>> for
>> > whatever needs to be done while the "orchestra" is playing
>> > shutdownHook() - called whenever shutdown is happening
>> > }
>> >
>> > For those who don't like the smell of Java, the source file could just
>> be
>> > the contents, which would then be injected into the class framing code
>> > before compilation.
>> >
>> > These are just ideas.  I'm not familiar enough with Brooklyn in its
>> current
>> > implementation to be able to create realistic pseudocode.
>> >
>> > Peter
>> >
>> > On Thu, Sep 1, 2022 at 9:24 AM Geoff Macartney <
>> geoff.macartney@gmail.com>
>> > wrote:
>> >
>> > > Hi Alex,
>> > >
>> > > That's great, I'll be excited to hear all about it.  7th September
>> > > suits me fine; I would probably prefer 4.00 p.m. over 11.00.
>> > >
>> > > Cheers
>> > > Geoff
>> > >
>> > > On Thu, 1 Sept 2022 at 12:41, Alex Heneveld <al...@cloudsoft.io>
>> wrote:
>> > > >
>> > > > Thanks for the excellent feedback Geoff and yes there are some very
>> cool
>> > > and exciting things added recently -- containers, conditions, and
>> terraform
>> > > and kubernetes support, all of which make writing complex blueprints
>> much
>> > > easier.
>> > > >
>> > > > I'd love to host a session to showcase these.
>> > > >
>> > > > How does Wed 7 Sept sound?  I could do 11am UK or 4pm UK --
>> depending
>> > > what time suits for people who are interested.  Please RSVP and
>> indicate
>> > > your time preference!
>> > > >
>> > > > Best
>> > > > Alex
>> > > >
>> > > >
>> > > > On Wed, 31 Aug 2022 at 22:17, Geoff Macartney <
>> geoff.macartney@gmail.com>
>> > > wrote:
>> > > >>
>> > > >> Hi Alex,
>> > > >>
>> > > >> Another thought occurred to me when reading that workflow
>> proposal. You
>> > > wrote
>> > > >>
>> > > >> "and with the recent support for container-based tasks and
>> declarative
>> > > >> conditions, we have taken big steps towards enabling YAML
>> authorship"
>> > > >>
>> > > >> Unfortunately over the past while I haven't been able to keep up as
>> > > >> closely as I would like with developments in Brooklyn. I'm just
>> > > >> wondering if it might be possible to get together some time, on
>> Google
>> > > >> Meet or Zoom or whatnot, if you or a colleague could spare half an
>> > > >> hour to demo some of these recent developments? But don't worry
>> about
>> > > >> it if you're too busy at present.
>> > > >>
>> > > >> Adding dev@ to this in CC for the sake of Openness. Others might
>> also
>> > > >> be interested!
>> > > >>
>> > > >> Cheers
>> > > >> Geoff
>> > >
>>
>

Re: Declarative Workflow update & shorthand/DSL

Posted by Alex Heneveld <al...@cloudsoft.io>.
Thanks Geoff.  I've had informal feedback from others in favour of the list
approach, and in my trials it is working nicely.  I will apply the changes in
the docs and PRs.

Two other things:

* Geoff's comment made me wonder about having a "function" step, where
within a workflow one could define a local function.  I'm thinking *not*
for just now, but FYI it would not be too hard to add.

* The semantics of referencing a sensor which isn't yet "ready" are
ambiguous.  Do we wait, return null, or give an error?  By default it
blocks, but this feels wrong.  I think we should (1) add a new `wait` step
which allows waiting for a value, (2) give an error when such a reference is
used elsewhere (which is also the behaviour if you reference a non-existent
key in a map), and (3) support, in local variables (`let`) only, some
limited evaluation, including the `??` nullish operator which forgives an
error on the LHS (and basic maths).

In a bit more detail on that last point, it would allow:

- let x = ${entity.sensor.DoesNotExist} ?? 0
- let x = x + 1

But it would give an error if you reference it in log or other commands:

- log the value is ${entity.sensor.DoesNotExist}
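
For point (1), the proposed `wait` step might then be used like this (a
sketch; the exact shorthand for `wait` is not settled):

- wait ${entity.sensor.count}
- log the value is ${entity.sensor.count}

where the `wait` blocks until the sensor is available, after which the
reference in the `log` step is safe.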

Best
Alex


On Wed, 21 Sept 2022 at 20:35, Geoff Macartney <ge...@apache.org> wrote:

> Hi Alex, Mykola,
>
> By the way I should mention that I'm very busy in the evenings this week so
> might not get to look at the latest PR for a while. By all means go ahead
> and merge it if Mykola and/or others are happy with it, no need to wait for
> me.
>
> Cheers
> Geoff
>
>
> On Tue, 20 Sept 2022, 22:20 Geoff Macartney, <ge...@apache.org> wrote:
>
> > Hi Alex,
> >
> > +1 This updated proposal looks good - I do think the list based
> > approach will be simpler and less error prone, and the fact that you
> > will support an optional `id` anyway, if that is desired, means it
> > retains much of the flexibility of the map based approach. The custom
> > workflow step looks a little like the "functions" that we discussed
> > previously. Putting this all together will be pretty powerful.
> >
> > Will try to get a look at the latest PR if I can.
> >
> > Cheers
> > Geoff
> >
> >
> > On Mon, 19 Sept 2022 at 17:31, Alex Heneveld <al...@cloudsoft.io> wrote:
> > >
> > > Geoff-  Thanks.  Comments addressed in #1361 along with a major
> > > addition to support variables -- inputs/outputs/etc.
> > >
> > > All-  One of the points Geoff makes concerns how steps are defined.  I
> > > think that, along with other comments, tips the balance in favour of
> > > revisiting that design.
> > >
> > > I propose we switch from the OLD proposed approach -- the map of
> > > ordered IDs -- to a NEW LIST-BASED approach.  There's a lot of detail
> > > below but in short it's shifting from:
> > >
> > > steps:
> > >   1-say-hi:  log hi
> > >   2-step-two:  log step 2
> > >
> > > To:
> > >
> > > steps:
> > >   - log hi
> > >   - log step 2
> > >
> > >
> > > Specifically, based on feedback and more hands-on experience, I propose:
> > >
> > > * steps are now supplied as a list (not a map)
> > > * users are no longer required to supply an ID for each step (in the
> > > old approach, the ID was required as the key for every step)
> > > * users can if they wish supply an ID for any step (now as an explicit
> > > `id: <ID>` key)
> > > * the default order, if no `next: <ID>` instruction is supplied, is the
> > > order of the list (in the old approach the order was based on the ID)
> > >
> > > Also, the shorthand idea has evolved a little bit; instead of a
> > > "<type>: <type-specific-shorthand-template>" single-key map, we've suggested:
> > >
> > > * it be a string "<type> <type-specific-shorthand-template>"
> > > * shorthand can also be supplied in a map using the key "s" or the key
> > > "shorthand" (to allow shorthand along with other step key values)
> > > * custom steps can define custom shorthand templates (e.g. ${key} "=" ${value})
> > > * (there is also some evolution in how custom steps are defined)
> > >
> > >
> > > To illustrate:
> > >
> > > The OLD EXAMPLE:
> > >
> > > steps:
> > >     1:
> > >       type: container
> > >       image: my/google-cloud
> > >       command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
> > >       env:
> > >         BUCKET: $brooklyn:config("bucket")
> > >       on-error: retry
> > >     2:
> > >       set-sensor: spark-output=${1.stdout}
> > >
> > > Would become in the NEW proposal:
> > >
> > > steps:
> > >     - type: container
> > >       image: my/google-cloud
> > >       command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
> > >       env:
> > >         BUCKET: $brooklyn:config("bucket")
> > >       on-error: retry
> > >     - set-sensor spark-output = ${1.stdout}
> > >
> > > If we wanted to attach an `id` to the second step (e.g. for use with
> > > "next") we could write it either as:
> > >
> > >     # full long-hand map
> > >     - type: set-sensor
> > >       input:
> > >         sensor: spark-output
> > >         value: ${1.stdout}
> > >       id: set-spark-output
> > >
> > >     # mixed "s" shorthand key and other fields
> > >     - s: set-sensor spark-output = ${1.stdout}
> > >       id: set-spark-output
> > >
> > > To explain the reasoning:
> > >
> > > The advantages of the NEW list-based approach:
> > >
> > > * Slightly less verbose when no ID is needed on a step
> > > * Easier to read and understand the flow
> > > * Avoids the hassle of renumbering when introducing a step
> > > * Avoids the risk of error where the same key is defined multiple times
> > >
> > > The advantages of the OLD map-based scheme (implied disadvantages of
> > > the new list-based approach):
> > >
> > > * Easier user-facing correlation of steps (e.g. in the UI), since every
> > > step always has an explicit ID
> > > * Easier to extend a workflow by inserting or overriding explicit steps
> > >
> > > After some initial usage of the workflow, it seems these advantages of
> > > the old approach are outweighed by the advantages of the list approach.
> > > In particular the "correlation" can be done in other ways, and extending
> > > a workflow is probably not so useful, whereas supplying and maintaining
> > > an ID is a hassle, error-prone, and harder to understand.
> > >
> > > Finally, to explain the custom steps idea: it works out nicely in the
> > > code, and we think it will be nice for users too, allowing a
> > > "compound-step" to be added to the catalog, e.g. as follows for the
> > > workflow shown above:
> > >
> > >   id: retryable-gcloud-dataproc-with-bucket-and-sensor
> > >   item:
> > >     type: custom-workflow-step
> > >     parameters:
> > >       bucket:
> > >         type: string
> > >       sensor_name:
> > >         type: string
> > >         default: spark-output
> > >     shorthand_definition: [ " bucket " ${bucket} ] [ " sensor " ${sensor_name} ]
> > >     steps:
> > >     - type: container
> > >       image: my/google-cloud
> > >       command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
> > >       env:
> > >         BUCKET: ${bucket}
> > >       on-error: retry
> > >     - set-sensor ${sensor_name} = ${1.stdout}
> > >
> > > A user could then write a step:
> > >
> > > - retryable-gcloud-dataproc-with-bucket-and-sensor
> > >
> > > And optionally use the shorthand per the shorthand_definition, matching
> > > the quoted string literals and inferring the indicated parameters, e.g.:
> > >
> > > - retryable-gcloud-dataproc-with-bucket-and-sensor bucket my-bucket sensor my-spark-output
> > >
> > > They could of course also use the longhand:
> > >
> > > - type: retryable-gcloud-dataproc-with-bucket-and-sensor
> > >   input:
> > >     bucket: my-bucket
> > >     sensor_name: my-spark-output
> > >
> > >
> > > Best
> > > Alex
> > >
> > >
> > >
> > > On Sat, 17 Sept 2022 at 21:13, Geoff Macartney <ge...@apache.org>
> > wrote:
> > >
> > > > Hi Alex,
> > > >
> > > > Belatedly reviewed the PR. It's looking good! And surprisingly simple
> > > > in the end. Made a couple of minor comments on it.
> > > >
> > > > Cheers
> > > > Geoff
> > > >
> > > > On Thu, 8 Sept 2022 at 09:35, Alex Heneveld <al...@cloudsoft.io> wrote:
> > > > > [...]
>

Re: Declarative Workflow update & mutex

Posted by Geoff Macartney <ge...@apache.org>.
Hi Alex et al,

First of all, congrats on the implementation of the workflows so far,
I think you've made amazing progress in a short time.

The discussion of the mutex issues is pretty complex - hardly
surprising though, as dealing with concurrency is always hard. Here
are some thoughts.

Referring to your options:

(1) I have a feeling that just making set-sensor in some sense be
atomic is probably not enough to support a wide enough range of
use-cases. I'm also not sure that we would want to do any sort of
concurrency control "implicitly" - I think it is probably preferable
to have explicit declarations in the workflow, whatever form they
take. I think that will both make it easier to implement and, probably
more importantly, easier to understand for developers reading the
workflow code. Hence, I would avoid this option.

(2) I don't feel I understand the examples in here fully, but anyway -
I don't think I like the idea of pessimistic locking for the same sort
of reasons as (1) above. Better to be explicit - if there is no
declaration of concurrency control, then there is no concurrency
control. This also lets workflows be more lightweight when they don't
need locking. (Then again, won't they mostly need it? Hm, not sure.)

I don't really see how the "# mutex lock acquisition" example works.
Where does the lock arise? To put it another way, why does

set-sensor lock-for-count = ${workflow.id}

not simply declare an ordinary string-type sensor? Does the
"lock-for-" prefix implicitly create a lock type sensor?

In the "rock-solid" example, the whole "require" section on the
set-sensor step seems like it is making the code author responsible
for the low-level details of the control flow. This could be an easy
way for bugs to arise, say if the author forgot one of the 'any'
conditions. I could imagine that normally the requirement will match
the conditions you gave - the saved sensor lock should match the
workflow ID (or be absent, on first execution). Could this not be
enforced as the standard behaviour, so that users wouldn't have to
write the condition at all? Similarly the whole "on-error" clause
seems perhaps unnecessary - doesn't it just describe what will always
be the case if "replayable: yes"?

Another point relates to "clear-sensor". Could we avoid requiring this
from users? What if they forget to add it, will that leave the mutex
erroneously locked? Would there be any possibility of regarding the
workflow as a "scope", such that from the point of acquiring the lock,
it remained locked for the rest of that scope, and was automatically
released at the end of that scope? (A kind of RAII, if you'll forgive
an allusion to C++.) That actually seems kind of like what you have
written in the first code example in point (3), so let's move on to
that.

(3) is what I like best, because it has that very explicit quality,
with the formal statement of a "lock". Declaring it like this is bound
to make implementation easier for us. It certainly makes it easier to
understand for users. The variant I like best is the one with the
workflow step type "lock:".

So I'm going to suggest that for the moment you just proceed with (3),
the "fairly simple high level mechanism". Let's provide that and give
it a bit of time to gain experience with it. If it turns out not to be
a rich enough model then maybe something lower level like (2) may be
needed; but I do think it is likely to be a more frequent source of
errors.

About "replayable". I agree this doesn't feel intuitive. Again, it
seems like mixing in a low level concept, and that it would be good if
there was some convention we could adopt such that users wouldn't have
to be explicit about what points are replayable and what not. However,
in this case I don't have any clear idea that any such convention is
possible. I rather fear not. Perhaps whether something should be
replayable or not is a higher-order consideration, not something we
could decide, and will have to be left up to the code author? I don't
have a good grasp of the issues here.

Whether any of this is of use I don't know, hopefully there are some
points worth a thought.

Cheers
Geoff

On Fri, 11 Nov 2022 at 13:41, Alex Heneveld <al...@cloudsoft.io> wrote:
>
> Hi team,
>
> Workflow is in a pretty good state -- nearly done, mostly as per the
> proposal, with nearly all step types, retries, docs [1], and integration
> into the activities view in the inspector.  My feeling is that it radically
> transforms how easy it is to write sensors and effectors.
>
> Thanks to everyone for their reviews and feedback!
>
> The biggest change from the original proposal is a switch to the list
> syntax from the init.d syntax.  Extra thanks to those who agitated for that.
>
> The other really nice aspect is how the shorthand step notation functions
> as a DSL in simple cases so extra thanks for the suggestion to make it
> DSL-like.
>
> The two items remaining are nested workflows and controlling how long
> workflows are remembered.
>
>
> There is one new feature which seems to be needed, which I wanted to
> raise.  As the subject suggests, this is mutexes.  I had hoped we wouldn't
> need this but actually quite often you want to ensure no other workflows
> are conflicting.  Consider the simple case where you want to atomically
> increment a sensor:
>
> ```
> - let i = ${entity.sensor.count} ?? 0
> - let i = ${i} + 1
> - set-sensor count = ${i}
> ```
>
> Running this twice we'd hope to get count=2.  But if the runs are
> concurrent you may not.  So how can we ensure no other instances of the
> workflow are running concurrently?
>
> There are three options I see.
>
>
> (1) set-sensor allows arithmetic
>
> We could support arithmetic on set-sensor and require it to run atomically
> against that sensor.  For instance
> `set-sensor count = (${entity.sensor.count} ?? 0) + 1`.  We could fairly
> easily ensure the RHS is evaluated with the caller holding the lock on the
> sensor count.  However our arithmetic support is quite limited (we don't
> support grouping at present, so either you'd have to write
> `${entity.sensor.count} + 1 ?? 1` or we'd beef that up), and I think there
> is something nice about the present rule where arithmetic is only allowed
> on `let`, so it is more inspectable.
>
>
> (2) set-sensor with mutex check
>
> We could introduce a check which is done while the lock on the sensor is
> held.  So you could check the sensor is unset before setting it, and fail
> if it isn't, or check the value is as expected.  You can then set up
> whatever retry behaviour you want in the usual way.  For instance:
>
> ```
> # pessimistic locking
> - let i = ${entity.sensor.count} ?? 0
> - let j = ${i} + 1
> - step: set-sensor count = ${j}
>   require: ${i}
>   on-error:
>   - goto start
> ```
>
> ```
> # mutex lock acquisition
> - step: set-sensor lock-for-count = ${workflow.id}
>   require:
>     when: absent
>   on-error:
>   - retry backoff 50ms increasing 2x up to 5s
> - let i = ${entity.sensor.count} ?? 0
> - let i = ${i} + 1
> - set-sensor count = ${i}
> - clear-sensor lock-for-count
> ```
>
> (A couple of subtleties for those of you new to the workflow conditions:
> they always have an implicit target depending on context, which for
> `require` we would make the old sensor value; "when: absent" is a special
> predicate DSL keyword to say that a sensor is unavailable (you could also
> use `when: absent_or_null` or `when: falsy` or `not: { when: truthy }`).
> Finally `require: ${i}` uses the fact that conditions default to being an
> equality check.  That call is equivalent to `require: { target:
> ${entity.sensor.count}, equals: ${i} }`.)
>
> The retry with backoff is pretty handy here.  But there is still one
> problem: in the lock acquisition case, if the workflow fails after step 1,
> what will clear the lock?  (Pessimistic locking doesn't have this
> problem.)  Happily we have an easy solution, because workflows were
> designed with recovery in mind.  If Brooklyn detects an interrupted
> workflow on startup, it will fail it with a "dangling workflow" exception,
> and you can attach recovery to it, specifying replay points and making
> steps idempotent.
>
> ```
> # rock-solid mutex lock acquisition
> steps:
> - step: set-sensor lock-for-count = ${workflow.id}
>   require:
>     any:
>     - when: absent
>     - equals: ${workflow.id}
>   on-error:
>   - retry backoff 50ms increasing 2x up to 5s
> - let i = ${entity.sensor.count} ?? 0
> - let i = ${i} + 1
> - step: set-sensor count = ${i}
>   replayable: yes
> - clear-sensor lock-for-count
> on-error:
>   - condition:
>       error-cause:
>         glob: "*DanglingWorkflowException*"
>     step: retry replay
> replayable: yes
> ```
>
> The `require` block now allows re-entrancy.  We rely on the facts that
> Brooklyn gives workflow instances a unique ID; that on failover Dangling is
> thrown from an interrupted step, preserving the workflow ID (but giving it
> a different task ID so replays can be distinguished, with support for this
> in the UI); and that Brooklyn persistence handles election of a single
> primary, with any demoted instance interrupting its tasks.  The workflow is
> replayable from the start, and on Dangling it replays.  Additionally we can
> replay from the `set-sensor` step, which will use local copies of the
> workflow variables, so if that step is interrupted and runs twice it will
> be idempotent.
>
>
> (3) explicit `lock` keyword on `workflow` step
>
> My feeling is that (2) is really powerful, and the pessimistic locking case
> is easy enough, but the mutex implementation is hard.  We could make it
> easy to opt in to by allowing sub-workflow blocks to specify a mutex.
>
> ```
> - step: workflow lock incrementing-count
>   steps:
>     - let i = ${entity.sensor.count} ?? 0
>     - let i = ${i} + 1
>     - step: set-sensor count = ${i}
>       replayable: yes
> ```
>
> This would act exactly as the previous example, setting the workflow.id
> into a sensor ahead of time, allowing absent or current workflow id as the
> value for re-entrancy, retrying if locked, clearing it after, and handling
> the Dangling exception.  The only tiny differences are that the sensor it
> atomically sets and clears would be called something like
> `workflow-lock-incrementing-count`, and the steps would be running in a
> sub-workflow.
>
> In this example we still need to say that the final `set-sensor count` is a
> replay point, otherwise if Brooklyn were interrupted in the split-second
> after setting the sensor but before recording the fact that the step
> completed, it would retry from the start, causing a slight chance that the
> sensor increments by 2.  This isn't to do with the fact that a mutex is
> wanted, however; it's because adding one is fundamentally not an idempotent
> operation!
> Provided the steps are idempotent there would be no need for it.
>
> For instance to make sure that `apt` (or `dpkg` or `terraform` or any
> command which requires a lock) isn't invoked concurrently you could use it
> like this:
>
> ```
> type: workflow-effector
> name: apt-get
> parameters:
>   package:
>     description: package(s) to install
> steps:
> - step: workflow lock apt
>   steps:
>     - ssh sudo apt-get install ${package}
> ```
>
> Parallel invocations of effector `apt-get { package: xxx }` will share a
> mutex on sensor `workflow-lock-apt` ensuring they don't conflict.
>
> You could even define a new workflow step type (assuming "lock xxx" in the
> shorthand sets the key `lock` on the workflow step):
>
> ```
> id: apt-get
> type: workflow
> lock: apt
> shorthand: ${package}
> parameters:
>   package:
>     description: package(s) to install
> steps:
> - ssh sudo apt-get install ${package}
> ```
>
> With this in the catalog, you could in a workflow have steps `apt-get xxx`
> which acquire a lock from Brooklyn before running so they don't fail if
> invoked in parallel.
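>
> For example, another workflow might then simply say (a sketch; the package
> names are illustrative):
>
> ```
> steps:
> - apt-get curl
> - apt-get unzip
> ```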
>
> ---
>
> I lean towards providing (2) and (3).  It's actually fairly easy to
> implement given what we already have, and allows low-level control for
> special cases via (2) and a fairly simple high-level mechanism (3).
>
> Thoughts?
>
> On a related note, I'm not 100% happy with `replayable` as a keyword and
> its options.  The essential thing is to indicate which points in a workflow
> are safe to replay from if it is interrupted or fails.  We currently
> support "yes" and "no" to indicate if something is a replayable point,
> "true" and "false" as synonyms, and "on" and "off" to indicate that a step
> and all following ones are or are not replayable (unless overridden with
> replayable yes/no or until changed with another replayable on/off).  It
> defaults to "off".  I think it is logically a good solution, but it doesn't
> feel intuitive.  Alternative suggestions welcome!
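>
> To illustrate the current options (a sketch; the step contents are
> illustrative):
>
> ```
> steps:
> - step: ssh ./download.sh
>   replayable: on    # this step and all following ones are replay points
> - ssh ./install.sh
> - step: ssh ./notify.sh
>   replayable: no    # override: this single step is not a replay point
> ```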
>
>
> Cheers for your input!
> Alex
>
>
> [1]
> https://github.com/apache/brooklyn-docs/blob/master/guide/blueprints/workflow/index.md

Re: Declarative Workflow update, mutex, replay and retention

Posted by Geoff Macartney <ge...@gmail.com>.
Hi Alex,

I'll try to get a look at the doc soon. Just to check, did you get my
reply to your email on the workflow mutexes? Was there any mileage in
the thinking?

Cheers
Geoff

On Fri, 18 Nov 2022 at 13:44, Alex Heneveld <al...@cloudsoft.io> wrote:
>
> Hi team,
>
> I've got most of the proposed "lock" implementation completed, as discussed
> in the previous mail (below), PR to come shortly.  It works well, though
> there are a huge number of subtleties, so test cases were a challenge; it
> makes it all the better that we provide this, however, so workflow authors
> have a much easier time.  The biggest challenge was to make sure that if an
> author writes e.g. `{ type: workflow, lock: my-lock, on-error: [ retry ],
> steps: [ ... ] }`, and Brooklyn does a failover, the retry (at the new
> server) preserves the lock ... but if there is no retry, or the retry
> fails, the lock is cleared.
>
> As part of this I've been mulling over "replayable"; as I mentioned below
> it was one aspect I'm not entirely sure of, and it's quite closely related
> to "expiration" which I think might be better described as "retention".  I
> think I've got a better way to handle those, and a tweak to error
> handling.  It's described in this section:
>
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.63aibdplmze
>
> There are two questions towards the end that I'd especially value input on.
>
> Thanks
> Alex
>
>
> On Fri, 11 Nov 2022 at 13:40, Alex Heneveld <al...@cloudsoft.io> wrote:
> > [...]

Re: Declarative Workflow update, mutex, replay and retention

Posted by Geoff Macartney <ge...@gmail.com>.
Hi Alex,

I'm afraid I find the updated doc quite confusing - I think it needs
work to clarify this new distinction between replayable and resumable,
which I don't understand. What's the difference between those
concepts? The first mention of "resumable" in the doc (at the start of
section "Common Replay Settings") is

"Most of the time, there are just a few tweaks to `resumable` and
`replayable` needed to let Apache Brooklyn do the right thing to
replay correctly."

But there has been no mention of "resumable" up to that point and no
explanation of what it means or where and how it is used.

Additionally in the section "Advanced: Replay/Resume Settings", under
the bullet for Replayable is the option "automatically"; but in the
2nd bullet where it documents "from here" it talks about "resumable:
automatically". Which is it? Or is it both?

In the example on p. 31 it comments "step type is not resumable so
replay will not be permitted here" and so sets "replayable: no".  Is a
separate concept needed here?

There are also a couple of mentions of something called "retry
replay". The first, (p. 26) talks about "a `retry replay` step",
without really defining it, the second, in the explanation of (step
level) "replayable: from here", talks about "any 'retry replay' or
'resumable: automatically' handler", again without explanation.

Up until now I felt I sort of understood the proposals, but I now feel
I'm lost, and that the document has become quite hard to follow.

At the very least we are going to need to make sure that the
documentation that we put into brooklyn-docs for this is very
carefully and thoughtfully written. Workflows are going to be very
powerful, but if they're not well documented they're going to be very
hard to understand. It might turn out you're the only person who knows
how to write them!

Cheers,
Geoff

On Mon, 21 Nov 2022 at 15:48, Alex Heneveld <al...@cloudsoft.io> wrote:
>
> Following some feedback re "replayable" I've rejigged that section [1].  It
> changes the concepts to treat whether steps are "resumable" as a separate
> idea from noting explicit "replayable" waypoints; in most cases either can
> allow a workflow to be replayed, e.g. if a resumable step is interrupted
> (such as a sleep, but not, for instance, an http call) or if the workflow
> author indicated that a completed step was "replayable here".
>
> Thanks to those who gave their input!  I am much happier with this rejigged
> plan.  More comments are welcome!
>
> Best
> Alex
>
> [1]
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.63aibdplmze
>
>

Re: Declarative Workflow update, mutex, replay and retention

Posted by Alex Heneveld <al...@cloudsoft.io>.
Following some feedback re "replayable" I've rejigged that section [1].  It
changes the concepts to treat whether steps are "resumable" as a separate
idea from noting explicit "replayable" waypoints; in most cases either can
allow a workflow to be replayed, e.g. if a resumable step is interrupted
(such as a sleep, but not, for instance, an http call) or if the workflow
author indicated that a completed step was "replayable here".
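
For a rough illustration of the distinction (a sketch only -- see the doc
for the exact keywords and semantics):

```
steps:
- step: sleep 1h
  # resumable: if interrupted mid-step, this can simply be resumed
- step: http https://example.com/start-job
  # an http call is not resumable mid-flight
- step: set-sensor job-started = true
  replayable: from here
  # an explicit waypoint from which a replay may safely start
```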

Thanks to those who gave their input!  I am much happier with this rejigged
plan.  More comments are welcome!

Best
Alex

[1]
https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.63aibdplmze


Re: Declarative Workflow update, mutex, replay and retention

Posted by Alex Heneveld <al...@cloudsoft.io>.
Hi team,

I've got most of the proposed "lock" implementation completed, as discussed
in the previous mail (below), PR to come shortly.  It works well, though
there are a huge number of subtleties, so test cases were a challenge; that
makes it all the better that we provide this, however, so workflow authors
have a much easier time.  The biggest challenge was to make sure that if an
author writes e.g. `{ type: workflow, lock: my-lock, on-error: [ retry ],
steps: [ ... ] }`, then if Brooklyn does a failover the retry (at the new
server) preserves the lock ... but if there is no retry, or the retry
fails, the lock is cleared.
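
Concretely, that shape written out as a step (an illustrative sketch; the
nested steps are just an example body):

```
- type: workflow
  lock: my-lock
  on-error: [ retry ]
  steps:
    - let i = ${entity.sensor.count} ?? 0
    - let i = ${i} + 1
    - set-sensor count = ${i}
```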

As part of this I've been mulling over "replayable"; as I mentioned below
it was one aspect I'm not entirely sure of, and it's quite closely related
to "expiration" which I think might be better described as "retention".  I
think I've got a better way to handle those, and a tweak to error
handling.  It's described in this section:

https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.63aibdplmze

There are two questions towards the end that I'd especially value input on.

Thanks
Alex


Declarative Workflow update & mutex

Posted by Alex Heneveld <al...@cloudsoft.io>.
Hi team,

Workflow is in a pretty good state -- nearly done mostly as per
proposal, with nearly all step types, retries, docs [1], and integrated
into the activities view in the inspector.  My feeling is that it radically
transforms how easy it is to write sensors and effectors.

Thanks to everyone for their reviews and feedback!

The biggest change from the original proposal is a switch to the list
syntax from the init.d syntax.  Extra thanks to those who agitated for that.

The other really nice aspect is how the shorthand step notation functions
as a DSL in simple cases so extra thanks for the suggestion to make it
DSL-like.

The two items remaining are nested workflows and controlling how long
workflows are remembered.


There is one new feature which seems to be needed, and which I wanted to
raise.  As the subject suggests, this is mutexes.  I had hoped we wouldn't
need this, but quite often you want to ensure no other workflows are
conflicting.  Consider the simple case where you want to atomically
increment a sensor:

```
- let i = ${entity.sensor.count} ?? 0
- let i = ${i} + 1
- set-sensor count = ${i}
```

Running this twice we'd hope to get count=2.  But if the runs are
concurrent you may not: both runs may read 0 and then both set 1, losing an
update.  So how can we ensure no other instances of the workflow are
running concurrently?

There are three options I see.


(1) set-sensor allows arithmetic

We could support arithmetic on set-sensor and require it to run atomically
against that sensor.  For instance `set-sensor count =
(${entity.sensor.count} ?? 0) + 1`.  We could fairly easily ensure the RHS
is evaluated with the caller holding the lock on the sensor count.  However
our arithmetic support is quite limited (we don't support grouping at
present, so either you'd have to write `${entity.sensor.count} + 1 ?? 1` or
we'd beef that up), and I think there is something nice about the present
arrangement, where arithmetic is only allowed on `let` so it is more
inspectable.


(2) set-sensor with mutex check

We could introduce a check which is done while the lock on the sensor is
held.  So you could check the sensor is unset before setting it, and fail
if it isn't, or check the value is as expected.  You can then set up
whatever retry behaviour you want in the usual way.  For instance:

```
# pessimistic locking
- let i = ${entity.sensor.count} ?? 0
- let j = ${i} + 1
- step: set-sensor count = ${j}
  require: ${i}
  on-error:
  - goto start
```

```
# mutex lock acquisition
- step: set-sensor lock-for-count = ${workflow.id}
  require:
    when: absent
  on-error:
  - retry backoff 50ms increasing 2x up to 5s
- let i = ${entity.sensor.count} ?? 0
- let i = ${i} + 1
- set-sensor count = ${i}
- clear-sensor lock-for-count
```

(A couple of subtleties for those of you new to the workflow conditions:
they always have an implicit target depending on context, which for
`require` we would make the old sensor value; "when: absent" is a special
predicate DSL keyword to say that a sensor is unavailable (you could also
use `when: absent_or_null` or `when: falsy` or `not: { when: truthy }`).
Finally `require: ${i}` uses the fact that conditions default to being an
equality check.  That call is equivalent to `require: { target:
${entity.sensor.count}, equals: ${i} }`.)
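
Spelling out the longhand equivalence noted above, the last step of the
pessimistic example is thus:

```
- step: set-sensor count = ${j}
  require:
    target: ${entity.sensor.count}
    equals: ${i}
  on-error:
  - goto start
```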

The retry with backoff is pretty handy here.  But there is still one
problem in the lock acquisition case: if the workflow fails after step 1,
what will clear the lock?  (Pessimistic locking doesn't have this
problem.)  Happily we have an easy solution, because workflows were
designed with recovery in mind.  If Brooklyn detects an interrupted
workflow on startup, it will fail it with a "dangling workflow" exception,
and you can attach recovery to it by specifying replay points and making
steps idempotent.

```
# rock-solid mutex lock acquisition
steps:
- step: set-sensor lock-for-count = ${workflow.id}
  require:
    any:
    - when: absent
    - equals: ${workflow.id}
  on-error:
  - retry backoff 50ms increasing 2x up to 5s
- let i = ${entity.sensor.count} ?? 0
- let i = ${i} + 1
- step: set-sensor count = ${i}
  replayable: yes
- clear-sensor lock-for-count
on-error:
  - condition:
      error-cause:
        glob: "*DanglingWorkflowException*"
    step: retry replay
replayable: yes
```

The `require` block now allows re-entrancy.  We rely on the facts that
Brooklyn gives workflow instances a unique ID; that on failover, Dangling
is thrown from an interrupted step preserving the workflow ID (but giving
it a different task ID so replays can be distinguished, with support for
this in the UI); and that Brooklyn persistence handles election of a single
primary, with any demoted instance interrupting its tasks.  The workflow is
replayable from the start, and on Dangling it replays.  Additionally we can
replay from the `set-sensor` step, which will use local copies of the
workflow variables, so if that step is interrupted and runs twice it will
be idempotent.


(3) explicit `lock` keyword on `workflow` step

My feeling is that (2) is really powerful, and the pessimistic locking case
is easy enough, but the mutex implementation is hard.  We could make it
easy to opt in to, by allowing sub-workflow blocks to specify a mutex.

```
- step: workflow lock incrementing-count
  steps:
    - let i = ${entity.sensor.count} ?? 0
    - let i = ${i} + 1
    - step: set-sensor count = ${i}
      replayable: yes
```

This would act exactly as the previous example, setting the workflow.id
into a sensor ahead of time, allowing absent or current workflow id as the
value for re-entrancy, retrying if locked, clearing it after, and handling
the Dangling exception.  The only tiny differences are that the sensor it
atomically sets and clears would be called something like
`workflow-lock-incrementing-count`, and the steps would be running in a
sub-workflow.

In this example we still need to say that the final `set-sensor count` is a
replay point; otherwise, if Brooklyn were interrupted in the split-second
after setting the sensor but before recording the fact that the step
completed, it would retry from the start, causing a slight chance that the
sensor increments by 2.  This isn't to do with the mutex, however; it's
because adding one is fundamentally not an idempotent operation!  Provided
the steps are idempotent there would be no need for it.

For instance to make sure that `apt` (or `dpkg` or `terraform` or any
command which requires a lock) isn't invoked concurrently you could use it
like this:

```
type: workflow-effector
name: apt-get
parameters:
  package:
    description: package(s) to install
steps:
- step: workflow lock apt
  steps:
    - ssh sudo apt-get install ${package}
```

Parallel invocations of effector `apt-get { package: xxx }` will share a
mutex on sensor `workflow-lock-apt` ensuring they don't conflict.

You could even define a new workflow step type (assuming "lock xxx" in the
shorthand is a key `lock` on the workflow step):

```
id: apt-get
type: workflow
lock: apt
shorthand: ${package}
parameters:
  package:
    description: package(s) to install
steps:
- ssh sudo apt-get install ${package}
```

With this in the catalog, you could use `apt-get xxx` as a step in any
workflow; it acquires a lock from Brooklyn before running, so parallel
invocations don't fail.
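
For example (package names illustrative):

```
steps:
- apt-get nginx
- apt-get vim
```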

---

I lean towards providing (2) and (3).  It's actually fairly easy to
implement given what we already have, and allows low-level control for
special cases via (2) and a fairly simple high-level mechanism (3).

Thoughts?

On a related note, I'm not 100% happy with `replayable` as a keyword and
its options.  The essential thing is to indicate which points in a workflow
are safe to replay from if it is interrupted or fails.  We currently
support "yes" and "no" to indicate if something is a replayable point,
"true" and "false" as synonyms, and "on" and "off" to indicate that a step
and all following ones are or are not replayable (unless overridden with
replayable yes/no or until changed with another replayable on/off).  It
defaults to "off".  I think it is logically a good solution, but it doesn't
feel intuitive.  Alternative suggestions welcome!


Cheers for your input!
Alex


[1]
https://github.com/apache/brooklyn-docs/blob/master/guide/blueprints/workflow/index.md

Re: Declarative Workflow update & shorthand/DSL

Posted by Geoff Macartney <ge...@apache.org>.
Hi Alex, Mykola,

By the way I should mention that I'm very busy in the evenings this week so
might not get to look at the latest PR for a while. By all means go ahead
and merge it if Mykola and/or others are happy with it, no need to wait for
me.

Cheers
Geoff


On Tue, 20 Sept 2022, 22:20 Geoff Macartney, <ge...@apache.org> wrote:

> Hi Alex,
>
> +1 This updated proposal looks good - I do think the list based
> approach will be simpler and less error prone, and the fact that you
> will support an optional `id` anyway, if that is desired, means it
> retains much of the flexibility of the map based approach. The custom
> workflow step looks a little like the "functions" that we discussed
> previously. Putting this all together will be pretty powerful.
>
> Will try to get a look at the latest PR if I can.
>
> Cheers
> Geoff
>
>
> On Mon, 19 Sept 2022 at 17:31, Alex Heneveld <al...@cloudsoft.io> wrote:
> >
> > Geoff-  Thanks.  Comments addressed in #1361 along with a major addition
> to
> > support variables -- inputs/outputs/etc.
> >
> > All-  One of the points Geoff makes concerns how steps are defined.  I
> > think along with other comments that tips the balance in favour of
> > revisiting how steps are defined.
> >
> > I propose we switch from the OLD proposed approach -- the map of ordered
> > IDs -- to a NEW LIST-BASED approach.  There's a lot of detail below but
> > in short it's shifting from:
> >
> > steps:
> >   1-say-hi:  log hi
> >   2-step-two:  log step 2
> >
> > To:
> >
> > steps:
> >   - log hi
> >   - log step 2
> >
> >
> > Specifically, based on feedback and more hands-on experience, I propose:
> >
> > * steps are now supplied as a list (not a map)
> > * users are no longer required to supply an ID for each step (in the old
> > approach, the ID was required as the key for every step)
> > * users can if they wish supply an ID for any step (now as an explicit
> `id:
> > <ID>` rule)
> > * the default order, if no `next: <ID>` instruction is supplied, is the
> > order of the list (in the old approach the order was based on the ID)
> >
> > Also, the shorthand idea has evolved a little bit; instead of a "<type>:
> > <type-specific-shorthand-template>" single-key map, we've suggested:
> >
> > * it be a string "<type> <type-specific-shorthand-template>"
> > * shorthand can also be supplied in a map using the key "s" or the key
> > "shorthand" (to allow shorthand along with other step key values)
> > * custom steps can define custom shorthand templates (e.g. `${key} "=" ${value}`)
> > * (there is also some evolution in how custom steps are defined)
> >
> >
> > To illustrate:
> >
> > The OLD EXAMPLE:
> >
> > steps:
> >    1:
> >       type: container
> >       image: my/google-cloud
> >       command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
> >       env:
> >         BUCKET: $brooklyn:config("bucket")
> >       on-error: retry
> >    2:
> >       set-sensor: spark-output=${1.stdout}
> >
> > Would become in the NEW proposal:
> >
> > steps:
> >     - type: container
> >       image: my/google-cloud
> >       command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
> >       env:
> >         BUCKET: $brooklyn:config("bucket")
> >       on-error: retry
> >     - set-sensor spark-output = ${1.stdout}
> >
> > If we wanted to attach an `id` to the second step (e.g. for use with
> > "next") we could write it either as:
> >
> >     # full long-hand map
> >     - type: set-sensor
> >       input:
> >         sensor: spark-output
> >         value: ${1.stdout}
> >       id: set-spark-output
> >
> >     # mixed "s" shorthand key and other fields
> >     - s: set-sensor spark-output = ${1.stdout}
> >       id: set-spark-output
> >
> > To explain the reasoning:
> >
> > The advantages of the NEW list-based steps:
> >
> > * Slightly less verbose when no ID is needed on a step
> > * Easier to read and understand flow
> > * Avoids the hassle of renumbering when introducing a step
> > * Avoids the risk of error where the same key is defined multiple times
> >
> > The advantages of the OLD map-based scheme (implied disadvantages of the
> > new list-based approach):
> >
> > * Easier user-facing correlation of steps (e.g. in the UI), by always
> > having an explicit ID
> > * Easier to extend a workflow by inserting or overriding explicit steps
> >
> > After some initial usage of the workflow, it seems these advantages of
> the
> > old approach are outweighed by the advantages of the list approach.  In
> > particular the "correlation" can be done in other ways, and extending a
> > workflow is probably not so useful, whereas supplying and maintaining an
> ID
> > is a hassle, error-prone, and harder to understand.
> >
> > Finally, to explain the custom steps idea: it works out nicely in the code,
> > and we think for users too, to add a "compound-step" to the catalog, e.g. as
> > follows for the workflow shown above:
> >
> >   id: retryable-gcloud-dataproc-with-bucket-and-sensor
> >   item:
> >     type: custom-workflow-step
> >     parameters:
> >       bucket:
> >         type: string
> >       sensor_name:
> >         type: string
> >         default: spark-output
> >     shorthand_definition: [ " bucket " ${bucket} ] [ " sensor "
> > ${sensor_name} ]
> >     steps:
> >     - type: container
> >       image: my/google-cloud
> >       command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
> >       env:
> >         BUCKET: ${bucket}
> >       on-error: retry
> >     - set-sensor ${sensor_name} = ${1.stdout}
> >
> > A user could then write a step:
> >
> > - retryable-gcloud-dataproc-with-bucket-and-sensor
> >
> > And optionally use the shorthand per the shorthand_definition, matching
> the
> > quoted string literals and inferring the indicated parameters, e.g.:
> >
> > - retryable-gcloud-dataproc-with-bucket-and-sensor bucket my-bucket
> sensor
> > my-spark-output
> >
> > They could of course also use the longhand:
> >
> > - type: retryable-gcloud-dataproc-with-bucket-and-sensor
> >   input:
> >     bucket: my-bucket
> >     sensor_name: my-spark-output
> >
> >
> > Best
> > Alex
> >
> >
> >
> > On Sat, 17 Sept 2022 at 21:13, Geoff Macartney <ge...@apache.org>
> wrote:
> >
> > > Hi Alex,
> > >
> > > Belatedly reviewed the PR. It's looking good! And surprisingly simple
> > > in the end. Made a couple of minor comments on it.
> > >
> > > Cheers
> > > Geoff
> > >
> > > > On Wed, 7 Sept 2022 at 09:58, Alex Heneveld <al...@cloudsoft.io>
> wrote:
> > > >
> > > > > Hi Peter,
> > > > >
> > > > > Yes - thanks for the extra details.  I did take your suggestion to
> be a
> > > > > procedural DSL not YAML, per the illustration at [1] (second code
> > > block).
> > > > > Probably where I was confusing was in saying that unlike DSLs which
> > > just
> > > > > run (and where the execution can be delegated to eg
> java/groovy/ruby),
> > > here
> > > > > we need to understand and display, store and resume the workflow
> > > progress.
> > > > > So I think it needs to be compiled to some representation that is
> well
> > > > > described and that new Apache Brooklyn code can reason about, both
> in
> > > the
> > > > > UI (JS) and backend (Java).  Parsing a DSL is much harder than
> using
> > > YAML
> > > > > for this "reasonable" representation (as in we can reason _about_
> it
> > > :) ),
> > > > > because we already have good backend processing, persistence,
> > > > > serialization; and frontend processing and visualization support
> for
> > > > > YAML-based models.  So I think we almost definitely want a
> > > well-described
> > > > > declarative YAML model of the workflow.
> > > > >
> > > > > We might *also* want a Workflow DSL because I agree with you a DSL
> > > would
> > > > > be nicer for a user to write (if writing by hand; although if
> composing
> > > > > visually a drag-and-drop to YAML is probably easier).  However it
> > > should
> > > > > probably get "compiled" into a Workflow YAML.  So I'm suggesting
> we do
> > > the
> > > > > workflow YAML at this stage, and a DSL that compiles into that YAML
> > > can be
> > > > > designed later.  (Designing a good DSL and parser and
> reason-about-able
> > > > > representation is a big task, so being able to separate it feels
> good
> > > too!)
> > > > >
> > > > > Best
> > > > > Alex
> > > > >
> > > > > [1]
> > > > >
> > >
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.75wm48pjvx0h
> > > > >
> > > > >
> > > > > On Fri, 2 Sept 2022 at 20:17, Geoff Macartney <
> > > geoff.macartney@gmail.com>
> > > > > wrote:
> > > > >
> > > > >> Hi Peter,
> > > > >>
> > > > >> Thanks for such a detailed writeup of how you see this working. I
> fear
> > > > >> I've too little experience with this sort of thing to be able to
> say
> > > > >> anything very useful about it. My thought on the matter would be,
> > > > >> let's get started with the yaml based approach and see how it
> goes. I
> > > > >> think that experience would then give us a much better feel for
> what a
> > > > >> really nice and usable DSL for workflows would look like
> (probably to
> > > > >> address all the pain points of the yaml approach! :-)   The
> outline
> > > > >> above will then be a good starting point, I'm sure.
> > > > >>
> > > > >> Cheers
> > > > >> Geoff
> > > > >>
> > > > >> On Thu, 1 Sept 2022 at 21:26, Peter Abramowitsch
> > > > >> <pa...@gmail.com> wrote:
> > > > >> >
> > > > >> > Hi All
> > > > >> > I just wanted to clarify something in my comment the other day
> about
> > > > >> DSLs
> > > > >> > since I see that the acronym was also used in Alex's original
> > > document.
> > > > >> > Unless I misunderstood, Alex was proposing to create a DSL for
> > > Brooklyn
> > > > >> > using yaml as syntax and writing a code layer to translate
> between
> > > that
> > > > >> > syntax and underlying APIs which are presumably all in Java.
> > > > >> >
> > > > >> > What I was suggesting was a DSL written directly in  Java (I
> guess)
> > > > >> whose
> > > > >> > syntax would be that language, but whose grammar would be
> keywords
> > > that
> > > > >> > were also Java functions.  Some of these functions would be
> > > pre-defined
> > > > >> in
> > > > >> > the DSL, while others could be  defined by the user and could
> use
> > > other
> > > > >> > functions of the DSL.    The result would be turned into a JAR
> file
> > > (or
> > > > >> > equivalent in another platform)   But during the compile phase,
> it
> > > > >> would be
> > > > >> > checked for errors, and it could be debugged line by line either
> > > > >> invoking
> > > > >> > live functionality or using a library of mock versions of the
> > > Brooklyn
> > > > >> API.
> > > > >> >
> > > > >> > In this 'native' DSL one could provide different types of
> workflow
> > > > >> > constructs as functions (In the BaseClass), taking function
> names as
> > > > >> method
> > > > >> > pointers, or using Lambdas.  It would be a lot easier in Ruby or
> > > Python
> > > > >> >
> > > > >> > // linear
> > > > >> > brooklynRun(NamedTaskMethod, NamedTaskMethod)
> > > > >> >
> > > > >> > // chained
> > > > >> > TaskMethodA().TaskMethodB()
> > > > >> >
> > > > >> > // asynchronous
> > > > >> > brooklynJoin(NamedTaskMethod, NamedTaskMethod,...)
> > > > >> >
> > > > >> > // conditional
> > > > >> > brooklynRunIf(NamedTaskMethod, NamedConditionMethod,...)
> > > > >> >
> > > > >> > // iterative
> > > > >> > brooklynRunWhile(NamedTaskMethod, NamedConditionMethod,...)
> > > > >> > brooklynRunUntil(NamedTaskMethod, NamedConditionMethod,...)
> > > > >> >
> > > > >> > // there could even be a utility to implement legacy syntax
> (this of
> > > > >> course
> > > > >> > would require the extra code layer I was trying to avoid)
> > > > >> > runYaml(Path)
> > > > >> >
> > > > >> > A basic class structure might be
> > > > >> >
> > > > >> > // where BrooklynRecipeBase implements the utility functions
> > > including,
> > > > >> > among others  Join, Run, If, While, Until mentioned above
> > > > >> > // and the BrooklynWorkflowInterface would dictate the
> functional
> > > > >> > requirements for the mandatory aspects of the Recipe.
> > > > >> > class MyRecipe extends BrooklynRecipeBase implements
> > > > >> > BrooklynWorkflowInterface
> > > > >> > {
> > > > >> > Initialize()
> > > > >> > createContext()   - spin up resources
> > > > >> > workflow() - the main launch sequence using aspects of the DSL
> > > > >> > monitoring() - an asynchronous workflow used to manage sensor
> > > output or
> > > > >> for
> > > > >> > whatever needs to be done while the "orchestra" is playing
> > > > >> > shutdownHook() - called whenever shutdown is happening
> > > > >> > }
> > > > >> >
> > > > >> > For those who don't like the smell of Java, the source file
> could
> > > just
> > > > >> be
> > > > >> > the contents, which would then be injected into the class
> framing
> > > code
> > > > >> > before compilation.
> > > > >> >
> > > > >> > These are just ideas.  I'm not familiar enough with Brooklyn in
> its
> > > > >> current
> > > > >> > implementation to be able to create realistic pseudocode.
> > > > >> >
> > > > >> > Peter
> > > > >> >
> > > > >> > On Thu, Sep 1, 2022 at 9:24 AM Geoff Macartney <
> > > > >> geoff.macartney@gmail.com>
> > > > >> > wrote:
> > > > >> >
> > > > >> > > Hi Alex,
> > > > >> > >
> > > > >> > > That's great, I'll be excited to hear all about it.  7th
> September
> > > > >> > > suits me fine; I would probably prefer 4.00 p.m. over 11.00.
> > > > >> > >
> > > > >> > > Cheers
> > > > >> > > Geoff
> > > > >> > >
> > > > >> > > On Thu, 1 Sept 2022 at 12:41, Alex Heneveld <
> alex@cloudsoft.io>
> > > > >> wrote:
> > > > >> > > >
> > > > >> > > > Thanks for the excellent feedback Geoff and yes there are
> some
> > > very
> > > > >> cool
> > > > >> > > and exciting things added recently -- containers, conditions,
> and
> > > > >> terraform
> > > > >> > > and kubernetes support, all of which make writing complex
> > > blueprints
> > > > >> much
> > > > >> > > easier.
> > > > >> > > >
> > > > >> > > > I'd love to host a session to showcase these.
> > > > >> > > >
> > > > >> > > > How does Wed 7 Sept sound?  I could do 11am UK or 4pm UK --
> > > > >> depending
> > > > >> > > what time suits for people who are interested.  Please RSVP
> and
> > > > >> indicate
> > > > >> > > your time preference!
> > > > >> > > >
> > > > >> > > > Best
> > > > >> > > > Alex
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > On Wed, 31 Aug 2022 at 22:17, Geoff Macartney <
> > > > >> geoff.macartney@gmail.com>
> > > > >> > > wrote:
> > > > >> > > >>
> > > > >> > > >> Hi Alex,
> > > > >> > > >>
> > > > >> > > >> Another thought occurred to me when reading that workflow
> > > > >> proposal. You
> > > > >> > > wrote
> > > > >> > > >>
> > > > >> > > >> "and with the recent support for container-based tasks and
> > > > >> declarative
> > > > >> > > >> conditions, we have taken big steps towards enabling YAML
> > > > >> authorship"
> > > > >> > > >>
> > > > >> > > >> Unfortunately over the past while I haven't been able to
> keep
> > > up as
> > > > >> > > >> closely as I would like with developments in Brooklyn. I'm
> just
> > > > >> > > >> wondering if it might be possible to get together some
> time, on
> > > > >> Google
> > > > >> > > >> Meet or Zoom or whatnot, if you or a colleague could spare
> > > half an
> > > > >> > > >> hour to demo some of these recent developments? But don't
> worry
> > > > >> about
> > > > >> > > >> it if you're too busy at present.
> > > > >> > > >>
> > > > >> > > >> Adding dev@ to this in CC for the sake of Openness. Others
> > > might
> > > > >> also
> > > > >> > > >> be interested!
> > > > >> > > >>
> > > > >> > > >> Cheers
> > > > >> > > >> Geoff
> > > > >> > >
> > > > >>
> > > > >
> > >
>

Re: Declarative Workflow update & shorthand/DSL

Posted by Geoff Macartney <ge...@apache.org>.
Hi Alex,

+1 This updated proposal looks good - I do think the list-based
approach will be simpler and less error-prone, and the fact that you
will support an optional `id` anyway, if that is desired, means it
retains much of the flexibility of the map-based approach. The custom
workflow step looks a little like the "functions" that we discussed
previously. Putting this all together will be pretty powerful.

Will try to get a look at the latest PR if I can.

Cheers
Geoff


On Mon, 19 Sept 2022 at 17:31, Alex Heneveld <al...@cloudsoft.io> wrote:
>
> Geoff-  Thanks.  Comments addressed in #1361 along with a major addition to
> support variables -- inputs/outputs/etc.
>
> All-  One of the points Geoff makes concerns how steps are defined.  I
> think along with other comments that tips the balance in favour of
> revisiting how steps are defined.
>
> I propose we switch from the OLD proposed approach -- the map of ordered
> IDs -- to a NEW LIST-BASED approach.  There's a lot of detail below but
> in short, it's shifting from:
>
> steps:
>   1-say-hi:  log hi
>   2-step-two:  log step 2
>
> To:
>
> steps:
>   - log hi
>   - log step 2
>
>
> Specifically, based on feedback and more hands-on experience, I propose:
>
> * steps are now supplied as a list (not a map)
> * users are no longer required to supply an ID for each step (in the old
> approach, the ID was required as the key for every step)
> * users can if they wish supply an ID for any step (now as an explicit `id:
> <ID>` field)
> * the default order, if no `next: <ID>` instruction is supplied, is the
> order of the list (in the old approach the order was based on the ID)
>
> Also, the shorthand idea has evolved a little bit; instead of a "<type>:
> <type-specific-shorthand-template>" single-key map, we've suggested:
>
> * it be a string "<type> <type-specific-shorthand-template>"
> * shorthand can also be supplied in a map using the key "s" or the key
> "shorthand" (to allow shorthand along with other step key values)
> * custom steps can define custom shorthand templates (e.g. ${key} "=" ${value})
> * (there is also some evolution in how custom steps are defined)
>
>
> To illustrate:
>
> The OLD EXAMPLE:
>
> steps:
>    1:
>       type: container
>       image: my/google-cloud
>       command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
>       env:
>         BUCKET: $brooklyn:config("bucket")
>       on-error: retry
>    2:
>       set-sensor: spark-output=${1.stdout}
>
> Would become in the NEW proposal:
>
> steps:
>     - type: container
>       image: my/google-cloud
>       command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
>       env:
>         BUCKET: $brooklyn:config("bucket")
>       on-error: retry
>     - set-sensor spark-output = ${1.stdout}
>
> If we wanted to attach an `id` to the second step (e.g. for use with
> "next") we could write it either as:
>
>     # full long-hand map
>     - type: set-sensor
>       input:
>         sensor: spark-output
>         value: ${1.stdout}
>       id: set-spark-output
>
>     # mixed "s" shorthand key and other fields
>     - s: set-sensor spark-output = ${1.stdout}
>       id: set-spark-output
>
> To explain the reasoning:
>
> The advantages of the NEW list-based approach:
>
> * Slightly less verbose when no ID is needed on a step
> * Easier to read and understand flow
> * Avoids the hassle of renumbering when introducing a step
> * Avoids the risk of error where the same key is defined multiple times
>
> The advantages of the OLD map-based scheme (implied disadvantages of the
> new list-based approach):
>
> * Easier user-facing correlation of steps (e.g. in the UI), as every step
> always has an explicit ID
> * Easier to extend a workflow by inserting or overriding explicit steps
>
> After some initial usage of the workflow, it seems these advantages of the
> old approach are outweighed by the advantages of the list approach.  In
> particular the "correlation" can be done in other ways, and extending a
> workflow is probably not so useful, whereas supplying and maintaining an ID
> is a hassle, error-prone, and harder to understand.
>
> Finally, to explain the custom steps idea: it works out nicely in the code,
> and we think it will be easy for users to add a "compound-step" to the
> catalog, e.g. as follows for the workflow shown above:
>
>   id: retryable-gcloud-dataproc-with-bucket-and-sensor
>   item:
>     type: custom-workflow-step
>     parameters:
>       bucket:
>         type: string
>       sensor_name:
>         type: string
>         default: spark-output
>     shorthand_definition: [ " bucket " ${bucket} ] [ " sensor "
> ${sensor_name} ]
>     steps:
>     - type: container
>       image: my/google-cloud
>       command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
>       env:
>         BUCKET: ${bucket}
>       on-error: retry
>     - set-sensor ${sensor_name} = ${1.stdout}
>
> A user could then write a step:
>
> - retryable-gcloud-dataproc-with-bucket-and-sensor
>
> And optionally use the shorthand per the shorthand_definition, matching the
> quoted string literals and inferring the indicated parameters, e.g.:
>
> - retryable-gcloud-dataproc-with-bucket-and-sensor bucket my-bucket sensor
> my-spark-output
>
> They could of course also use the longhand:
>
> - type: retryable-gcloud-dataproc-with-bucket-and-sensor
>   input:
>     bucket: my-bucket
>     sensor_name: my-spark-output
>
>
> Best
> Alex
>
>
>
> On Sat, 17 Sept 2022 at 21:13, Geoff Macartney <ge...@apache.org> wrote:
>
> > Hi Alex,
> >
> > Belatedly reviewed the PR. It's looking good! And surprisingly simple
> > in the end. Made a couple of minor comments on it.
> >
> > Cheers
> > Geoff
> >
> > On Thu, 8 Sept 2022 at 09:35, Alex Heneveld <al...@cloudsoft.io> wrote:
> > >
> > > Hi team,
> > >
> > > An initial PR with a few types and the ability to define an effector is
> > > available [1].
> > >
> > > This is enough for the next steps to be parallelized, e.g. new steps
> > > added.  The proposal has been updated with a work plan / list of tasks
> > > [2].  Any volunteers to help with some of the upcoming tasks let me know.
> > >
> > > Finally I've been thinking about the "shorthand syntax" and how to bring
> > us
> > > closer to Peter's proposal of a DSL.  The original proposal allowed
> > instead
> > > of a map e.g.
> > >
> > > step_sleep:
> > >   type: sleep
> > >   duration: 5s
> > >
> > > or
> > >
> > > step_update_service_up:
> > >   type: set-sensor
> > >   sensor:
> > >     name: service.isUp
> > >     type: boolean
> > >   value: true
> > >
> > > being able to use a shorthand _map_ with a single key being the type, and
> > > value interpreted by that type, so in the OLD SHORTHAND PROPOSAL the
> > above
> > > could be written:
> > >
> > > step_sleep:
> > >   sleep: 5s
> > >
> > > step_update_service_up:
> > >   set-sensor: service.isUp = true
> > >
> > > Having played with syntaxes a bit I wonder if we should instead say the
> > > shorthand DSL kicks in when the step _body_ is a string (instead of a
> > > single-key map), and the first word of the string being the type, and the
> > > remainder interpreted by the type, and we allow it to be a bit more
> > > ambitious.
> > >
> > > Concretely this NEW SHORTHAND PROPOSAL would look something like:
> > >
> > > step_sleep: sleep 5s
> > > step_update_service_up: set-sensor service.isUp = true
> > > # also supporting a type, ie `set-sensor [TYPE] NAME = VALUE`, eg
> > > step_update_service_up: set-sensor boolean service.isUp = true
> > >
> > > You would still need the full map syntax whenever defining flow logic --
> > eg
> > > condition, next, retry, or timeout -- or any property not supported by
> > the
> > > shorthand syntax.  But for the (majority?) simple cases the expression
> > > would be very concise.  In most cases I think it would feel like a DSL
> > but
> > > has the virtue of a very clear translation to the actual workflow model
> > and
> > > the underlying (YAML) model needed for resumption and UI.
> > >
> > > As a final example, the example used at the start of the proposal
> > > (simplified a little -- removing on-error retry and env map as those
> > > wouldn't be supported by shorthand):
> > >
> > > brooklyn.initializers:
> > > - type: workflow-effector
> > >  name: run-spark-on-gcp
> > >  steps:
> > >    1:
> > >       type: container
> > >       image: my/google-cloud
> > >       command: gcloud dataproc jobs submit spark
> > > --BUCKET=gs://$brooklyn:config("bucket")
> > >    2:
> > >       type: set-sensor
> > >       sensor: spark-output
> > >       value: ${1.stdout}
> > >
> > > Could be written in this shorthand as follows:
> > >
> > >  steps:
> > >    1: container my/google-cloud command "gcloud dataproc jobs submit
> > spark
> > > --BUCKET=gs://${entity.config.bucket}"
> > >    2: set-sensor spark-output ${1.stdout}
> > >
> > > Thoughts?
> > >
> > > Best
> > > Alex
> > >
> > >
> > > [1] https://github.com/apache/brooklyn-server/pull/1358
> > > [2]
> > >
> > https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.gbadaqa2yql6
> > >
> > >
> > > On Wed, 7 Sept 2022 at 09:58, Alex Heneveld <al...@cloudsoft.io> wrote:
> > >
> > > > Hi Peter,
> > > >
> > > > Yes - thanks for the extra details.  I did take your suggestion to be a
> > > > procedural DSL not YAML, per the illustration at [1] (second code
> > block).
> > > > Probably where I was confusing was in saying that unlike DSLs which
> > just
> > > > run (and where the execution can be delegated to eg java/groovy/ruby),
> > here
> > > > we need to understand and display, store and resume the workflow
> > progress.
> > > > So I think it needs to be compiled to some representation that is well
> > > > described and that new Apache Brooklyn code can reason about, both in
> > the
> > > > UI (JS) and backend (Java).  Parsing a DSL is much harder than using
> > YAML
> > > > for this "reasonable" representation (as in we can reason _about_ it
> > :) ),
> > > > because we already have good backend processing, persistence,
> > > > serialization; and frontend processing and visualization support for
> > > > YAML-based models.  So I think we almost definitely want a
> > well-described
> > > > declarative YAML model of the workflow.
> > > >
> > > > We might *also* want a Workflow DSL because I agree with you a DSL
> > would
> > > > be nicer for a user to write (if writing by hand; although if composing
> > > > visually a drag-and-drop to YAML is probably easier).  However it
> > should
> > > > probably get "compiled" into a Workflow YAML.  So I'm suggesting we do
> > the
> > > > workflow YAML at this stage, and a DSL that compiles into that YAML
> > can be
> > > > designed later.  (Designing a good DSL and parser and reason-about-able
> > > > representation is a big task, so being able to separate it feels good
> > too!)
> > > >
> > > > Best
> > > > Alex
> > > >
> > > > [1]
> > > >
> > https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.75wm48pjvx0h
> > > >
> > > >
> > > > On Fri, 2 Sept 2022 at 20:17, Geoff Macartney <
> > geoff.macartney@gmail.com>
> > > > wrote:
> > > >
> > > >> Hi Peter,
> > > >>
> > > >> Thanks for such a detailed writeup of how you see this working. I fear
> > > >> I've too little experience with this sort of thing to be able to say
> > > >> anything very useful about it. My thought on the matter would be,
> > > >> let's get started with the yaml based approach and see how it goes. I
> > > >> think that experience would then give us a much better feel for what a
> > > >> really nice and usable DSL for workflows would look like (probably to
> > > >> address all the pain points of the yaml approach! :-)   The outline
> > > >> above will then be a good starting point, I'm sure.
> > > >>
> > > >> Cheers
> > > >> Geoff
> > > >>
> > > >> On Thu, 1 Sept 2022 at 21:26, Peter Abramowitsch
> > > >> <pa...@gmail.com> wrote:
> > > >> >
> > > >> > Hi All
> > > >> > I just wanted to clarify something in my comment the other day about
> > > >> DSLs
> > > >> > since I see that the acronym was also used in Alex's original
> > document.
> > > >> > Unless I misunderstood, Alex was proposing to create a DSL for
> > Brooklyn
> > > >> > using yaml as syntax and writing a code layer to translate between
> > that
> > > >> > syntax and underlying APIs which are presumably all in Java.
> > > >> >
> > > >> > What I was suggesting was a DSL written directly in  Java (I guess)
> > > >> whose
> > > >> > syntax would be that language, but whose grammar would be keywords
> > that
> > > >> > were also Java functions.  Some of these functions would be
> > pre-defined
> > > >> in
> > > >> > the DSL, while others could be  defined by the user and could use
> > other
> > > >> > functions of the DSL.    The result would be turned into a JAR file
> > (or
> > > >> > equivalent in another platform)   But during the compile phase, it
> > > >> would be
> > > >> > checked for errors, and it could be debugged line by line either
> > > >> invoking
> > > >> > live functionality or using a library of mock versions of the
> > Brooklyn
> > > >> API.
> > > >> >
> > > >> > In this 'native' DSL one could provide different types of workflow
> > > >> > constructs as functions (In the BaseClass), taking function names as
> > > >> method
> > > >> > pointers, or using Lambdas.  It would be a lot easier in Ruby or
> > Python
> > > >> >
> > > >> > // linear
> > > >> > brooklynRun(NamedTaskMethod, NamedTaskMethod)
> > > >> >
> > > >> > // chained
> > > >> > TaskMethodA().TaskMethodB()
> > > >> >
> > > >> > // asynchronous
> > > >> > brooklynJoin(NamedTaskMethod, NamedTaskMethod,...)
> > > >> >
> > > >> > // conditional
> > > >> > brooklynRunIf(NamedTaskMethod, NamedConditionMethod,...)
> > > >> >
> > > >> > // iterative
> > > >> > brooklynRunWhile(NamedTaskMethod, NamedConditionMethod,...)
> > > >> > brooklynRunUntil(NamedTaskMethod, NamedConditionMethod,...)
> > > >> >
> > > >> > // there could even be a utility to implement legacy syntax (this of
> > > >> course
> > > >> > would require the extra code layer I was trying to avoid)
> > > >> > runYaml(Path)
> > > >> >
> > > >> > A basic class structure might be
> > > >> >
> > > >> > // where BrooklynRecipeBase implements the utility functions
> > including,
> > > >> > among others  Join, Run, If, While, Until mentioned above
> > > >> > // and the BrooklynWorkflowInterface would dictate the functional
> > > >> > requirements for the mandatory aspects of the Recipe.
> > > >> > class MyRecipe extends BrooklynRecipeBase implements
> > > >> > BrooklynWorkflowInterface
> > > >> > {
> > > >> > Initialize()
> > > >> > createContext()   - spin up resources
> > > >> > workflow() - the main launch sequence using aspects of the DSL
> > > >> > monitoring() - an asynchronous workflow used to manage sensor
> > output or
> > > >> for
> > > >> > whatever needs to be done while the "orchestra" is playing
> > > >> > shutdownHook() - called whenever shutdown is happening
> > > >> > }
> > > >> >
> > > >> > For those who don't like the smell of Java, the source file could
> > just
> > > >> be
> > > >> > the contents, which would then be injected into the class framing
> > code
> > > >> > before compilation.
> > > >> >
> > > >> > These are just ideas.  I'm not familiar enough with Brooklyn in its
> > > >> current
> > > >> > implementation to be able to create realistic pseudocode.
> > > >> >
> > > >> > Peter
> > > >> >
> > > >> > On Thu, Sep 1, 2022 at 9:24 AM Geoff Macartney <
> > > >> geoff.macartney@gmail.com>
> > > >> > wrote:
> > > >> >
> > > >> > > Hi Alex,
> > > >> > >
> > > >> > > That's great, I'll be excited to hear all about it.  7th September
> > > >> > > suits me fine; I would probably prefer 4.00 p.m. over 11.00.
> > > >> > >
> > > >> > > Cheers
> > > >> > > Geoff
> > > >> > >
> > > >> > > On Thu, 1 Sept 2022 at 12:41, Alex Heneveld <al...@cloudsoft.io>
> > > >> wrote:
> > > >> > > >
> > > >> > > > Thanks for the excellent feedback Geoff and yes there are some
> > very
> > > >> cool
> > > >> > > and exciting things added recently -- containers, conditions, and
> > > >> terraform
> > > >> > > and kubernetes support, all of which make writing complex
> > blueprints
> > > >> much
> > > >> > > easier.
> > > >> > > >
> > > >> > > > I'd love to host a session to showcase these.
> > > >> > > >
> > > >> > > > How does Wed 7 Sept sound?  I could do 11am UK or 4pm UK --
> > > >> depending
> > > >> > > what time suits for people who are interested.  Please RSVP and
> > > >> indicate
> > > >> > > your time preference!
> > > >> > > >
> > > >> > > > Best
> > > >> > > > Alex
> > > >> > > >
> > > >> > > >
> > > >> > > > On Wed, 31 Aug 2022 at 22:17, Geoff Macartney <
> > > >> geoff.macartney@gmail.com>
> > > >> > > wrote:
> > > >> > > >>
> > > >> > > >> Hi Alex,
> > > >> > > >>
> > > >> > > >> Another thought occurred to me when reading that workflow
> > > >> proposal. You
> > > >> > > wrote
> > > >> > > >>
> > > >> > > >> "and with the recent support for container-based tasks and
> > > >> declarative
> > > >> > > >> conditions, we have taken big steps towards enabling YAML
> > > >> authorship"
> > > >> > > >>
> > > >> > > >> Unfortunately over the past while I haven't been able to keep
> > up as
> > > >> > > >> closely as I would like with developments in Brooklyn. I'm just
> > > >> > > >> wondering if it might be possible to get together some time, on
> > > >> Google
> > > >> > > >> Meet or Zoom or whatnot, if you or a colleague could spare
> > half an
> > > >> > > >> hour to demo some of these recent developments? But don't worry
> > > >> about
> > > >> > > >> it if you're too busy at present.
> > > >> > > >>
> > > >> > > >> Adding dev@ to this in CC for the sake of Openness. Others
> > might
> > > >> also
> > > >> > > >> be interested!
> > > >> > > >>
> > > >> > > >> Cheers
> > > >> > > >> Geoff
> > > >> > >
> > > >>
> > > >
> >

Re: Declarative Workflow update & shorthand/DSL

Posted by Alex Heneveld <al...@cloudsoft.io>.
Geoff-  Thanks.  Comments addressed in #1361 along with a major addition to
support variables -- inputs/outputs/etc.

All-  One of the points Geoff makes concerns how steps are defined.  I
think along with other comments that tips the balance in favour of
revisiting how steps are defined.

I propose we switch from the OLD proposed approach -- the map of ordered
IDs -- to a NEW LIST-BASED approach.  There's a lot of detail below but
in short, it's shifting from:

steps:
  1-say-hi:  log hi
  2-step-two:  log step 2

To:

steps:
  - log hi
  - log step 2


Specifically, based on feedback and more hands-on experience, I propose:

* steps are now supplied as a list (not a map)
* users are no longer required to supply an ID for each step (in the old
approach, the ID was required as the key for every step)
* users can if they wish supply an ID for any step (now as an explicit `id:
<ID>` field)
* the default order, if no `next: <ID>` instruction is supplied, is the
order of the list (in the old approach the order was based on the ID)

Also, the shorthand idea has evolved a little bit; instead of a "<type>:
<type-specific-shorthand-template>" single-key map, we've suggested:

* it be a string "<type> <type-specific-shorthand-template>"
* shorthand can also be supplied in a map using the key "s" or the key
"shorthand" (to allow shorthand along with other step key values)
* custom steps can define custom shorthand templates (e.g. ${key} "="
${value}; see the sketch below)
* (there is also some evolution in how custom steps are defined)
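
For instance, if the set-sensor type declared the template ${sensor} "="
${value} (names assumed here just for illustration), then a step written as:

    set-sensor service.isUp = true

would presumably bind sensor to "service.isUp" and value to "true", with
the quoted "=" matched literally.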


To illustrate:

The OLD EXAMPLE:

steps:
   1:
      type: container
      image: my/google-cloud
      command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
      env:
        BUCKET: $brooklyn:config("bucket")
      on-error: retry
   2:
      set-sensor: spark-output=${1.stdout}

Would become in the NEW proposal:

steps:
    - type: container
      image: my/google-cloud
      command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
      env:
        BUCKET: $brooklyn:config("bucket")
      on-error: retry
    - set-sensor spark-output = ${1.stdout}

If we wanted to attach an `id` to the second step (e.g. for use with
"next") we could write it either as:

    # full long-hand map
    - type: set-sensor
      input:
        sensor: spark-output
        value: ${1.stdout}
      id: set-spark-output

    # mixed "s" shorthand key and other fields
    - s: set-sensor spark-output = ${1.stdout}
      id: set-spark-output
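
    # presumably equivalent, using the longer "shorthand" key from above
    - shorthand: set-sensor spark-output = ${1.stdout}
      id: set-spark-output

A step elsewhere in the list could then name it as its successor, overriding
the list order (typically guarded by a condition, which needs the full map
syntax), e.g.:

    - s: log jumping to record output
      next: set-spark-output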

To explain the reasoning:

The advantages of the NEW list-based approach:

* Slightly less verbose when no ID is needed on a step
* Easier to read and understand flow
* Avoids the hassle of renumbering when introducing a step
* Avoids the risk of error where the same key is defined multiple times

The advantages of the OLD map-based scheme (implied disadvantages of the
new list-based approach):

* Easier user-facing correlation of steps (e.g. in the UI), as every step
always has an explicit ID
* Easier to extend a workflow by inserting or overriding explicit steps

After some initial usage of the workflow, it seems these advantages of the
old approach are outweighed by the advantages of the list approach.  In
particular the "correlation" can be done in other ways, and extending a
workflow is probably not so useful, whereas supplying and maintaining an ID
is a hassle, error-prone, and harder to understand.

Finally, to explain the custom steps idea: it works out nicely in the code,
and we think it will be easy for users to add a "compound-step" to the
catalog, e.g. as follows for the workflow shown above:

  id: retryable-gcloud-dataproc-with-bucket-and-sensor
  item:
    type: custom-workflow-step
    parameters:
      bucket:
        type: string
      sensor_name:
        type: string
        default: spark-output
    shorthand_definition: [ " bucket " ${bucket} ] [ " sensor "
${sensor_name} ]
    steps:
    - type: container
      image: my/google-cloud
      command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
      env:
        BUCKET: ${bucket}
      on-error: retry
    - set-sensor ${sensor_name} = ${1.stdout}

A user could then write a step:

- retryable-gcloud-dataproc-with-bucket-and-sensor

And optionally use the shorthand per the shorthand_definition, matching the
quoted string literals and inferring the indicated parameters, e.g.:

- retryable-gcloud-dataproc-with-bucket-and-sensor bucket my-bucket sensor
my-spark-output
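
If, as the square brackets in the shorthand_definition suggest, each group
is optional, then presumably the sensor group could be omitted to pick up
the default sensor_name, e.g.:

- retryable-gcloud-dataproc-with-bucket-and-sensor bucket my-bucket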

They could of course also use the longhand:

- type: retryable-gcloud-dataproc-with-bucket-and-sensor
  input:
    bucket: my-bucket
    sensor_name: my-spark-output


Best
Alex



On Sat, 17 Sept 2022 at 21:13, Geoff Macartney <ge...@apache.org> wrote:

> Hi Alex,
>
> Belatedly reviewed the PR. It's looking good! And surprisingly simple
> in the end. Made a couple of minor comments on it.
>
> Cheers
> Geoff
>
> On Thu, 8 Sept 2022 at 09:35, Alex Heneveld <al...@cloudsoft.io> wrote:
> >
> > Hi team,
> >
> > An initial PR with a few types and the ability to define an effector is
> > available [1].
> >
> > This is enough for the next steps to be parallelized, e.g. new steps
> > added.  The proposal has been updated with a work plan / list of tasks
> > [2].  Any volunteers to help with some of the upcoming tasks let me know.
> >
> > Finally I've been thinking about the "shorthand syntax" and how to bring
> us
> > closer to Peter's proposal of a DSL.  The original proposal allowed
> instead
> > of a map e.g.
> >
> > step_sleep:
> >   type: sleep
> >   duration: 5s
> >
> > or
> >
> > step_update_service_up:
> >   type: set-sensor
> >   sensor:
> >     name: service.isUp
> >     type: boolean
> >   value: true
> >
> > being able to use a shorthand _map_ with a single key being the type, and
> > value interpreted by that type, so in the OLD SHORTHAND PROPOSAL the
> above
> > could be written:
> >
> > step_sleep:
> >   sleep: 5s
> >
> > step_update_service_up:
> >   set-sensor: service.isUp = true
> >
> > Having played with syntaxes a bit I wonder if we should instead say the
> > shorthand DSL kicks in when the step _body_ is a string (instead of a
> > single-key map), and the first word of the string being the type, and the
> > remainder interpreted by the type, and we allow it to be a bit more
> > ambitious.
> >
> > Concretely this NEW SHORTHAND PROPOSAL would look something like:
> >
> > step_sleep: sleep 5s
> > step_update_service_up: set-sensor service.isUp = true
> > # also supporting a type, ie `set-sensor [TYPE] NAME = VALUE`, eg
> > step_update_service_up: set-sensor boolean service.isUp = true
> >
> > You would still need the full map syntax whenever defining flow logic --
> eg
> > condition, next, retry, or timeout -- or any property not supported by
> the
> > shorthand syntax.  But for the (majority?) simple cases the expression
> > would be very concise.  In most cases I think it would feel like a DSL
> but
> > has the virtue of a very clear translation to the actual workflow model
> and
> > the underlying (YAML) model needed for resumption and UI.
> >
> > As a final example, the example used at the start of the proposal
> > (simplified a little -- removing on-error retry and env map as those
> > wouldn't be supported by shorthand):
> >
> > brooklyn.initializers:
> > - type: workflow-effector
> >  name: run-spark-on-gcp
> >  steps:
> >    1:
> >       type: container
> >       image: my/google-cloud
> >       command: gcloud dataproc jobs submit spark
> > --BUCKET=gs://$brooklyn:config("bucket")
> >    2:
> >       type: set-sensor
> >       sensor: spark-output
> >       value: ${1.stdout}
> >
> > Could be written in this shorthand as follows:
> >
> >  steps:
> >    1: container my/google-cloud command "gcloud dataproc jobs submit
> spark
> > --BUCKET=gs://${entity.config.bucket}"
> >    2: set-sensor spark-output ${1.stdout}
> >
> > Thoughts?
> >
> > Best
> > Alex
> >
> >
> > [1] https://github.com/apache/brooklyn-server/pull/1358
> > [2]
> >
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.gbadaqa2yql6
> >
> >
> > On Wed, 7 Sept 2022 at 09:58, Alex Heneveld <al...@cloudsoft.io> wrote:
> >
> > > Hi Peter,
> > >
> > > Yes - thanks for the extra details.  I did take your suggestion to be a
> > > procedural DSL not YAML, per the illustration at [1] (second code
> block).
> > > Probably where I was confusing was in saying that unlike DSLs which
> just
> > > run (and where the execution can be delegated to eg java/groovy/ruby),
> here
> > > we need to understand and display, store and resume the workflow
> progress.
> > > So I think it needs to be compiled to some representation that is well
> > > described and that new Apache Brooklyn code can reason about, both in
> the
> > > UI (JS) and backend (Java).  Parsing a DSL is much harder than using
> YAML
> > > for this "reasonable" representation (as in we can reason _about_ it
> :) ),
> > > because we already have good backend processing, persistence,
> > > serialization; and frontend processing and visualization support for
> > > YAML-based models.  So I think we almost definitely want a
> well-described
> > > declarative YAML model of the workflow.
> > >
> > > We might *also* want a Workflow DSL because I agree with you a DSL
> would
> > > be nicer for a user to write (if writing by hand; although if composing
> > > visually a drag-and-drop to YAML is probably easier).  However it
> should
> > > probably get "compiled" into a Workflow YAML.  So I'm suggesting we do
> the
> > > workflow YAML at this stage, and a DSL that compiles into that YAML
> can be
> > > designed later.  (Designing a good DSL and parser and reason-about-able
> > > representation is a big task, so being able to separate it feels good
> too!)
> > >
> > > Best
> > > Alex
> > >
> > > [1]
> > >
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.75wm48pjvx0h
> > >
> > >
> > > On Fri, 2 Sept 2022 at 20:17, Geoff Macartney <
> geoff.macartney@gmail.com>
> > > wrote:
> > >
> > >> Hi Peter,
> > >>
> > >> Thanks for such a detailed writeup of how you see this working. I fear
> > >> I've too little experience with this sort of thing to be able to say
> > >> anything very useful about it. My thought on the matter would be,
> > >> let's get started with the yaml based approach and see how it goes. I
> > >> think that experience would then give us a much better feel for what a
> > >> really nice and usable DSL for workflows would look like (probably to
> > >> address all the pain points of the yaml approach! :-)   The outline
> > >> above will then be a good starting point, I'm sure.
> > >>
> > >> Cheers
> > >> Geoff
> > >>
> > >> On Thu, 1 Sept 2022 at 21:26, Peter Abramowitsch
> > >> <pa...@gmail.com> wrote:
> > >> >
> > >> > Hi All
> > >> > I just wanted to clarify something in my comment the other day about
> > >> DSLs
> > >> > since I see that the acronym was also used in Alex's original
> document.
> > >> > Unless I misunderstood, Alex was proposing to create a DSL for
> Brooklyn
> > >> > using yaml as syntax and writing a code layer to translate between
> that
> > >> > syntax and underlying APIs which are presumably all in Java.
> > >> >
> > >> > What I was suggesting was a DSL written directly in  Java (I guess)
> > >> whose
> > >> > syntax would be that language, but whose grammar would be keywords
> that
> > >> > were also Java functions.  Some of these functions would be
> pre-defined
> > >> in
> > >> > the DSL, while others could be  defined by the user and could use
> other
> > >> > functions of the DSL.    The result would be turned into a JAR file
> (or
> > >> > equivalent in another platform)   But during the compile phase, it
> > >> would be
> > >> > checked for errors, and it could be debugged line by line either
> > >> invoking
> > >> > live functionality or using a library of mock versions of the
> Brooklyn
> > >> API.
> > >> >
> > >> > In this 'native' DSL one could provide different types of workflow
> > >> > constructs as functions (In the BaseClass), taking function names as
> > >> method
> > >> > pointers, or using Lambdas.  It would be a lot easier in Ruby or
> Python
> > >> >
> > >> > // linear
> > >> > brooklynRun(NamedTaskMethod, NamedTaskMethod)
> > >> >
> > >> > // chained
> > >> > TaskMethodA().TaskMethodB()
> > >> >
> > >> > // asynchronous
> > >> > brooklynJoin(NamedTaskMethod, NamedTaskMethod,...)
> > >> >
> > >> > // conditional
> > >> > brooklynRunIf(NamedTaskMethod, NamedConditionMethod,...)
> > >> >
> > >> > // iterative
> > >> > brooklynRunWhile(NamedTaskMethod, NamedConditionMethod,...)
> > >> > brooklynRunUntil(NamedTaskMethod, NamedConditionMethod,...)
> > >> >
> > >> > // there could even be a utility to implement legacy syntax (this of
> > >> course
> > >> > would require the extra code layer I was trying to avoid)
> > >> > runYaml(Path)
> > >> >
> > >> > A basic class structure might be
> > >> >
> > >> > // where BrooklynRecipeBase implements the utility functions
> including,
> > >> > among others  Join, Run, If, While, Until mentioned above
> > >> > // and the BrooklynWorkflowInterface would dictate the functional
> > >> > requirements for the mandatory aspects of the Recipe.
> > >> > class MyRecipe extends BrooklynRecipeBase implements
> > >> > BrooklynWorkflowInterface
> > >> > {
> > >> > Initialize()
> > >> > createContext()   - spin up resources
> > >> > workflow() - the main launch sequence using aspects of the DSL
> > >> > monitoring() - an asynchronous workflow used to manage sensor
> output or
> > >> for
> > >> > whatever needs to be done while the "orchestra" is playing
> > >> > shutdownHook() - called whenever shutdown is happening
> > >> > }
> > >> >
> > >> > For those who don't like the smell of Java, the source file could
> just
> > >> be
> > >> > the contents, which would then be injected into the class framing
> code
> > >> > before compilation.
> > >> >
> > >> > These are just ideas.  I'm not familiar enough with Brooklyn in its
> > >> current
> > >> > implementation to be able to create realistic pseudocode.
> > >> >
> > >> > Peter
> > >> >
> > >> > On Thu, Sep 1, 2022 at 9:24 AM Geoff Macartney <
> > >> geoff.macartney@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Hi Alex,
> > >> > >
> > >> > > That's great, I'll be excited to hear all about it.  7th September
> > >> > > suits me fine; I would probably prefer 4.00 p.m. over 11.00.
> > >> > >
> > >> > > Cheers
> > >> > > Geoff
> > >> > >
> > >> > > On Thu, 1 Sept 2022 at 12:41, Alex Heneveld <al...@cloudsoft.io>
> > >> wrote:
> > >> > > >
> > >> > > > Thanks for the excellent feedback Geoff and yes there are some
> very
> > >> cool
> > >> > > and exciting things added recently -- containers, conditions, and
> > >> terraform
> > >> > > and kubernetes support, all of which make writing complex
> blueprints
> > >> much
> > >> > > easier.
> > >> > > >
> > >> > > > I'd love to host a session to showcase these.
> > >> > > >
> > >> > > > How does Wed 7 Sept sound?  I could do 11am UK or 4pm UK --
> > >> depending
> > >> > > what time suits for people who are interested.  Please RSVP and
> > >> indicate
> > >> > > your time preference!
> > >> > > >
> > >> > > > Best
> > >> > > > Alex
> > >> > > >
> > >> > > >
> > >> > > > On Wed, 31 Aug 2022 at 22:17, Geoff Macartney <
> > >> geoff.macartney@gmail.com>
> > >> > > wrote:
> > >> > > >>
> > >> > > >> Hi Alex,
> > >> > > >>
> > >> > > >> Another thought occurred to me when reading that workflow
> > >> proposal. You
> > >> > > wrote
> > >> > > >>
> > >> > > >> "and with the recent support for container-based tasks and
> > >> declarative
> > >> > > >> conditions, we have taken big steps towards enabling YAML
> > >> authorship"
> > >> > > >>
> > >> > > >> Unfortunately over the past while I haven't been able to keep
> up as
> > >> > > >> closely as I would like with developments in Brooklyn. I'm just
> > >> > > >> wondering if it might be possible to get together some time, on
> > >> Google
> > >> > > >> Meet or Zoom or whatnot, if you or a colleague could spare
> half an
> > >> > > >> hour to demo some of these recent developments? But don't worry
> > >> about
> > >> > > >> it if you're too busy at present.
> > >> > > >>
> > >> > > >> Adding dev@ to this in CC for the sake of Openness. Others
> might
> > >> also
> > >> > > >> be interested!
> > >> > > >>
> > >> > > >> Cheers
> > >> > > >> Geoff
> > >> > >
> > >>
> > >
>

Re: Declarative Workflow update & shorthand/DSL

Posted by Geoff Macartney <ge...@apache.org>.
Hi Alex,

Belatedly reviewed the PR. It's looking good! And surprisingly simple
in the end. Made a couple of minor comments on it.

Cheers
Geoff

On Thu, 8 Sept 2022 at 09:35, Alex Heneveld <al...@cloudsoft.io> wrote:
>
> Hi team,
>
> An initial PR with a few types and the ability to define an effector is
> available [1].
>
> This is enough for the next steps to be parallelized, e.g. new steps
> added.  The proposal has been updated with a work plan / list of tasks
> [2].  Any volunteers to help with some of the upcoming tasks let me know.
>
> Finally I've been thinking about the "shorthand syntax" and how to bring us
> closer to Peter's proposal of a DSL.  The original proposal allowed instead
> of a map e.g.
>
> step_sleep:
>   type: sleep
>   duration: 5s
>
> or
>
> step_update_service_up:
>   type: set-sensor
>   sensor:
>     name: service.isUp
>     type: boolean
>   value: true
>
> being able to use a shorthand _map_ with a single key being the type, and
> value interpreted by that type, so in the OLD SHORTHAND PROPOSAL the above
> could be written:
>
> step_sleep:
>   sleep: 5s
>
> step_update_service_up:
>   set-sensor: service.isUp = true
>
> Having played with syntaxes a bit I wonder if we should instead say the
> shorthand DSL kicks in when the step _body_ is a string (instead of a
> single-key map), and the first word of the string being the type, and the
> remainder interpreted by the type, and we allow it to be a bit more
> ambitious.
>
> Concretely this NEW SHORTHAND PROPOSAL would look something like:
>
> step_sleep: sleep 5s
> step_update_service_up: set-sensor service.isUp = true
> # also supporting a type, ie `set-sensor [TYPE] NAME = VALUE`, eg
> step_update_service_up: set-sensor boolean service.isUp = true
>
> You would still need the full map syntax whenever defining flow logic -- eg
> condition, next, retry, or timeout -- or any property not supported by the
> shorthand syntax.  But for the (majority?) simple cases the expression
> would be very concise.  In most cases I think it would feel like a DSL but
> has the virtue of a very clear translation to the actual workflow model and
> the underlying (YAML) model needed for resumption and UI.
>
> As a final example, the example used at the start of the proposal
> (simplified a little -- removing on-error retry and env map as those
> wouldn't be supported by shorthand):
>
> brooklyn.initializers:
> - type: workflow-effector
>  name: run-spark-on-gcp
>  steps:
>    1:
>       type: container
>       image: my/google-cloud
>       command: gcloud dataproc jobs submit spark
> --BUCKET=gs://$brooklyn:config("bucket")
>    2:
>       type: set-sensor
>       sensor: spark-output
>       value: ${1.stdout}
>
> Could be written in this shorthand as follows:
>
>  steps:
>    1: container my/google-cloud command "gcloud dataproc jobs submit spark
> --BUCKET=gs://${entity.config.bucket}"
>    2: set-sensor spark-output ${1.stdout}
>
> Thoughts?
>
> Best
> Alex
>
>
> [1] https://github.com/apache/brooklyn-server/pull/1358
> [2]
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.gbadaqa2yql6
>
>
> On Wed, 7 Sept 2022 at 09:58, Alex Heneveld <al...@cloudsoft.io> wrote:
>
> > Hi Peter,
> >
> > Yes - thanks for the extra details.  I did take your suggestion to be a
> > procedural DSL not YAML, per the illustration at [1] (second code block).
> > Probably where I was confusing was in saying that unlike DSLs which just
> > run (and where the execution can be delegated to eg java/groovy/ruby), here
> > we need to understand and display, store and resume the workflow progress.
> > So I think it needs to be compiled to some representation that is well
> > described and that new Apache Brooklyn code can reason about, both in the
> > UI (JS) and backend (Java).  Parsing a DSL is much harder than using YAML
> > for this "reasonable" representation (as in we can reason _about_ it :) ),
> > because we already have good backend processing, persistence,
> > serialization; and frontend processing and visualization support for
> > YAML-based models.  So I think we almost definitely want a well-described
> > declarative YAML model of the workflow.
> >
> > We might *also* want a Workflow DSL because I agree with you a DSL would
> > be nicer for a user to write (if writing by hand; although if composing
> > visually a drag-and-drop to YAML is probably easier).  However it should
> > probably get "compiled" into a Workflow YAML.  So I'm suggesting we do the
> > workflow YAML at this stage, and a DSL that compiles into that YAML can be
> > designed later.  (Designing a good DSL and parser and reason-about-able
> > representation is a big task, so being able to separate it feels good too!)
> >
> > Best
> > Alex
> >
> > [1]
> > https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.75wm48pjvx0h
> >
> >
> > On Fri, 2 Sept 2022 at 20:17, Geoff Macartney <ge...@gmail.com>
> > wrote:
> >
> >> Hi Peter,
> >>
> >> Thanks for such a detailed writeup of how you see this working. I fear
> >> I've too little experience with this sort of thing to be able to say
> >> anything very useful about it. My thought on the matter would be,
> >> let's get started with the yaml based approach and see how it goes. I
> >> think that experience would then give us a much better feel for what a
> >> really nice and usable DSL for workflows would look like (probably to
> >> address all the pain points of the yaml approach! :-)   The outline
> >> above will then be a good starting point, I'm sure.
> >>
> >> Cheers
> >> Geoff
> >>
> >> On Thu, 1 Sept 2022 at 21:26, Peter Abramowitsch
> >> <pa...@gmail.com> wrote:
> >> >
> >> > Hi All
> >> > I just wanted to clarify something in my comment the other day about
> >> DSLs
> >> > since I see that the acronym was also used in Alex's original document.
> >> > Unless I misunderstood, Alex was proposing to create a DSL for Brooklyn
> >> > using yaml as syntax and writing a code layer to translate between that
> >> > syntax and underlying APIs which are presumably all in Java.
> >> >
> >> > What I was suggesting was a DSL written directly in  Java (I guess)
> >> whose
> >> > syntax would be that language, but whose grammar would be keywords that
> >> > were also Java functions.  Some of these functions would be pre-defined
> >> in
> >> > the DSL, while others could be  defined by the user and could use other
> >> > functions of the DSL.    The result would be turned into a JAR file (or
> >> > equivalent in another platform)   But during the compile phase, it
> >> would be
> >> > checked for errors, and it could be debugged line by line either
> >> invoking
> >> > live functionality or using a library of mock versions of the Brooklyn
> >> API.
> >> >
> >> > In this 'native' DSL one could provide different types of workflow
> >> > constructs as functions (In the BaseClass), taking function names as
> >> method
> >> > pointers, or using Lambdas.  It would be a lot easier in Ruby or Python
> >> >
> >> > // linear
> >> > brooklynRun(NamedTaskMethod, NamedTaskMethod)
> >> >
> >> > // chained
> >> > TaskMethodA().TaskMethodB()
> >> >
> >> > // asynchronous
> >> > brooklynJoin(NamedTaskMethod, NamedTaskMethod,...)
> >> >
> >> > // conditional
> >> > brooklynRunIf(NamedTaskMethod, NamedConditionMethod,...)
> >> >
> >> > // iterative
> >> > brooklynRunWhile(NamedTaskMethod, NamedConditionMethod,...)
> >> > brooklynRunUntil(NamedTaskMethod, NamedConditionMethod,...)
> >> >
> >> > // there could even be a utility to implement legacy syntax (this of
> >> course
> >> > would require the extra code layer I was trying to avoid)
> >> > runYaml(Path)
> >> >
> >> > A basic class structure might be
> >> >
> >> > // where BrooklynRecipeBase implements the utility functions including,
> >> > among others  Join, Run, If, While, Until mentioned above
> >> > // and the BrooklynWorkflowInterface would dictate the functional
> >> > requirements for the mandatory aspects of the Recipe.
> >> > class MyRecipe extends BrooklynRecipeBase implements
> >> > BrooklynWorkflowInterface
> >> > {
> >> > Initialize()
> >> > createContext()   - spin up resources
> >> > workflow() - the main launch sequence using aspects of the DSL
> >> > monitoring() - an asynchronous workflow used to manage sensor output or
> >> for
> >> > whatever needs to be done while the "orchestra" is playing
> >> > shutdownHook() - called whenever shutdown is happening
> >> > }
> >> >
> >> > For those who don't like the smell of Java, the source file could just
> >> be
> >> > the contents, which would then be injected into the class framing code
> >> > before compilation.
> >> >
> >> > These are just ideas.  I'm not familiar enough with Brooklyn in its
> >> current
> >> > implementation to be able to create realistic pseudocode.
> >> >
> >> > Peter
> >> >
> >> > On Thu, Sep 1, 2022 at 9:24 AM Geoff Macartney <
> >> geoff.macartney@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi Alex,
> >> > >
> >> > > That's great, I'll be excited to hear all about it.  7th September
> >> > > suits me fine; I would probably prefer 4.00 p.m. over 11.00.
> >> > >
> >> > > Cheers
> >> > > Geoff
> >> > >
> >> > > On Thu, 1 Sept 2022 at 12:41, Alex Heneveld <al...@cloudsoft.io>
> >> wrote:
> >> > > >
> >> > > > Thanks for the excellent feedback Geoff and yes there are some very
> >> cool
> >> > > and exciting things added recently -- containers, conditions, and
> >> terraform
> >> > > and kubernetes support, all of which make writing complex blueprints
> >> much
> >> > > easier.
> >> > > >
> >> > > > I'd love to host a session to showcase these.
> >> > > >
> >> > > > How does Wed 7 Sept sound?  I could do 11am UK or 4pm UK --
> >> depending
> >> > > what time suits for people who are interested.  Please RSVP and
> >> indicate
> >> > > your time preference!
> >> > > >
> >> > > > Best
> >> > > > Alex
> >> > > >
> >> > > >
> >> > > > On Wed, 31 Aug 2022 at 22:17, Geoff Macartney <
> >> geoff.macartney@gmail.com>
> >> > > wrote:
> >> > > >>
> >> > > >> Hi Alex,
> >> > > >>
> >> > > >> Another thought occurred to me when reading that workflow
> >> proposal. You
> >> > > wrote
> >> > > >>
> >> > > >> "and with the recent support for container-based tasks and
> >> declarative
> >> > > >> conditions, we have taken big steps towards enabling YAML
> >> authorship"
> >> > > >>
> >> > > >> Unfortunately over the past while I haven't been able to keep up as
> >> > > >> closely as I would like with developments in Brooklyn. I'm just
> >> > > >> wondering if it might be possible to get together some time, on
> >> Google
> >> > > >> Meet or Zoom or whatnot, if you or a colleague could spare half an
> >> > > >> hour to demo some of these recent developments? But don't worry
> >> about
> >> > > >> it if you're too busy at present.
> >> > > >>
> >> > > >> Adding dev@ to this in CC for the sake of Openness. Others might
> >> also
> >> > > >> be interested!
> >> > > >>
> >> > > >> Cheers
> >> > > >> Geoff
> >> > >
> >>
> >

Re: Declarative Workflow update & shorthand/DSL

Posted by Mykola Mandra <my...@cloudsoftcorp.com>.
Yes, Geoff, I have myself spent a couple of hours reviewing it. I am ready to continue developing workflows further, and I would not delay it if there are no strong objections. It looks like a good addition to the core; moreover, it looks like there is demand for it.

Please leave comments on the merged PR and I will address them accordingly.

Mykola

> On 12 Sep 2022, at 09:59, Geoff Macartney <ge...@gmail.com> wrote:
> 
> Hi Mykola,
> 
> I'd have thought it would be worth giving it some time to review,
> given that it is such a significant change. I'm personally of the
> opinion that significant changes like this shouldn't be merged without
> review. I was hoping to do some review on it this week myself, though
> I don't really get time to work on Brooklyn these days, so I was also
> hoping that others would review it who are still close to the code.
> Did you get a chance to look through it yourself?
> 
> What do you all think? I guess if it is in place and doesn't break
> anything then it's OK?
> 
> Geoff
> 
> 
> 
> On Mon, 12 Sept 2022 at 09:04, Mykola Mandra
> <my...@cloudsoftcorp.com> wrote:
>> 
>> Hello All,
>> 
>> Was I too hasty to merge the PR - https://github.com/apache/brooklyn-server/pull/1358?
>> We agreed to split some of the work with Alex, and I did not see why it could not be merged to continue with what I’m working on.
>> 
>> Regards,
>> Mykola
>> 
>>> On 9 Sep 2022, at 20:17, Geoff Macartney <ge...@apache.org> wrote:
>>> 
>>> Hi Alex,
>>> 
>>> Thanks for the link, will have a look at the PR.
>>> 
>>> As for the new shorthand proposal, +1 from me. I think it will be a
>>> majority of cases when things *are* simple - setting sensors, etc.
>>> Your example above goes from eight lines of markup for the steps to
>>> two, I think that's a compelling argument in itself in favour of the
>>> proposal.
>>> 
>>> Cheers
>>> Geoff
>>> 
>>> On Thu, 8 Sept 2022 at 09:35, Alex Heneveld <al...@cloudsoft.io> wrote:
>>>> 
>>>> Hi team,
>>>> 
>>>> An initial PR with a few types and the ability to define an effector is
>>>> available [1].
>>>> 
>>>> This is enough for the next steps to be parallelized, e.g. new steps
>>>> added.  The proposal has been updated with a work plan / list of tasks
>>>> [2].  Any volunteers to help with some of the upcoming tasks let me know.
>>>> 
>>>> Finally I've been thinking about the "shorthand syntax" and how to bring us
>>>> closer to Peter's proposal of a DSL.  The original proposal allowed instead
>>>> of a map e.g.
>>>> 
>>>> step_sleep:
>>>> type: sleep
>>>> duration: 5s
>>>> 
>>>> or
>>>> 
>>>> step_update_service_up:
>>>> type: set-sensor
>>>> sensor:
>>>>   name: service.isUp
>>>>   type: boolean
>>>> value: true
>>>> 
>>>> being able to use a shorthand _map_ with a single key being the type, and
>>>> value interpreted by that type, so in the OLD SHORTHAND PROPOSAL the above
>>>> could be written:
>>>> 
>>>> step_sleep:
>>>> sleep: 5s
>>>> 
>>>> step_update_service_up:
>>>> set-sensor: service.isUp = true
>>>> 
>>>> Having played with syntaxes a bit I wonder if we should instead say the
>>>> shorthand DSL kicks in when the step _body_ is a string (instead of a
>>>> single-key map), and the first word of the string being the type, and the
>>>> remainder interpreted by the type, and we allow it to be a bit more
>>>> ambitious.
>>>> 
>>>> Concretely this NEW SHORTHAND PROPOSAL would look something like:
>>>> 
>>>> step_sleep: sleep 5s
>>>> step_update_service_up: set-sensor service.isUp = true
>>>> # also supporting a type, ie `set-sensor [TYPE] NAME = VALUE`, eg
>>>> step_update_service_up: set-sensor boolean service.isUp = true
>>>> 
>>>> You would still need the full map syntax whenever defining flow logic -- eg
>>>> condition, next, retry, or timeout -- or any property not supported by the
>>>> shorthand syntax.  But for the (majority?) simple cases the expression
>>>> would be very concise.  In most cases I think it would feel like a DSL but
>>>> has the virtue of a very clear translation to the actual workflow model and
>>>> the underlying (YAML) model needed for resumption and UI.
>>>> 
>>>> As a final example, the example used at the start of the proposal
>>>> (simplified a little -- removing on-error retry and env map as those
>>>> wouldn't be supported by shorthand):
>>>> 
>>>> brooklyn.initializers:
>>>> - type: workflow-effector
>>>> name: run-spark-on-gcp
>>>> steps:
>>>>  1:
>>>>     type: container
>>>>     image: my/google-cloud
>>>>     command: gcloud dataproc jobs submit spark
>>>> --BUCKET=gs://$brooklyn:config("bucket")
>>>>  2:
>>>>     type: set-sensor
>>>>     sensor: spark-output
>>>>     value: ${1.stdout}
>>>> 
>>>> Could be written in this shorthand as follows:
>>>> 
>>>> steps:
>>>>  1: container my/google-cloud command "gcloud dataproc jobs submit spark
>>>> --BUCKET=gs://${entity.config.bucket}"
>>>>  2: set-sensor spark-output ${1.stdout}
>>>> 
>>>> Thoughts?
>>>> 
>>>> Best
>>>> Alex
>>>> 
>>>> 
>>>> [1] https://github.com/apache/brooklyn-server/pull/1358
>>>> [2]
>>>> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.gbadaqa2yql6
>>>> 
>>>> 
>>>> On Wed, 7 Sept 2022 at 09:58, Alex Heneveld <al...@cloudsoft.io> wrote:
>>>> 
>>>>> Hi Peter,
>>>>> 
>>>>> Yes - thanks for the extra details.  I did take your suggestion to be a
>>>>> procedural DSL not YAML, per the illustration at [1] (second code block).
>>>>> Probably where I was confusing was in saying that unlike DSLs which just
>>>>> run (and where the execution can be delegated to eg java/groovy/ruby), here
>>>>> we need to understand and display, store and resume the workflow progress.
>>>>> So I think it needs to be compiled to some representation that is well
>>>>> described and that new Apache Brooklyn code can reason about, both in the
>>>>> UI (JS) and backend (Java).  Parsing a DSL is much harder than using YAML
>>>>> for this "reasonable" representation (as in we can reason _about_ it :) ),
>>>>> because we already have good backend processing, persistence,
>>>>> serialization; and frontend processing and visualization support for
>>>>> YAML-based models.  So I think we almost definitely want a well-described
>>>>> declarative YAML model of the workflow.
>>>>> 
>>>>> We might *also* want a Workflow DSL because I agree with you a DSL would
>>>>> be nicer for a user to write (if writing by hand; although if composing
>>>>> visually a drag-and-drop to YAML is probably easier).  However it should
>>>>> probably get "compiled" into a Workflow YAML.  So I'm suggesting we do the
>>>>> workflow YAML at this stage, and a DSL that compiles into that YAML can be
>>>>> designed later.  (Designing a good DSL and parser and reason-about-able
>>>>> representation is a big task, so being able to separate it feels good too!)
>>>>> 
>>>>> Best
>>>>> Alex
>>>>> 
>>>>> [1]
>>>>> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.75wm48pjvx0h
>>>>> 
>>>>> 
>>>>> On Fri, 2 Sept 2022 at 20:17, Geoff Macartney <ge...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi Peter,
>>>>>> 
>>>>>> Thanks for such a detailed writeup of how you see this working. I fear
>>>>>> I've too little experience with this sort of thing to be able to say
>>>>>> anything very useful about it. My thought on the matter would be,
>>>>>> let's get started with the yaml based approach and see how it goes. I
>>>>>> think that experience would then give us a much better feel for what a
>>>>>> really nice and usable DSL for workflows would look like (probably to
>>>>>> address all the pain points of the yaml approach! :-)   The outline
>>>>>> above will then be a good starting point, I'm sure.
>>>>>> 
>>>>>> Cheers
>>>>>> Geoff
>>>>>> 
>>>>>> On Thu, 1 Sept 2022 at 21:26, Peter Abramowitsch
>>>>>> <pa...@gmail.com> wrote:
>>>>>>> 
>>>>>>> Hi All
>>>>>>> I just wanted to clarify something in my comment the other day about
>>>>>> DSLs
>>>>>>> since I see that the acronym was also used in Alex's original document.
>>>>>>> Unless I misunderstood, Alex was proposing to create a DSL for Brooklyn
>>>>>>> using yaml as syntax and writing a code layer to translate between that
>>>>>>> syntax and underlying APIs which are presumably all in Java.
>>>>>>> 
>>>>>>> What I was suggesting was a DSL written directly in  Java (I guess)
>>>>>> whose
>>>>>>> syntax would be that language, but whose grammar would be keywords that
>>>>>>> were also Java functions.  Some of these functions would be pre-defined
>>>>>> in
>>>>>>> the DSL, while others could be  defined by the user and could use other
>>>>>>> functions of the DSL.    The result would be turned into a JAR file (or
>>>>>>> equivalent in another platform)   But during the compile phase, it
>>>>>> would be
>>>>>>> checked for errors, and it could be debugged line by line either
>>>>>> invoking
>>>>>>> live functionality or using a library of mock versions of the Brooklyn
>>>>>> API.
>>>>>>> 
>>>>>>> In this 'native' DSL one could provide different types of workflow
>>>>>>> constructs as functions (In the BaseClass), taking function names as
>>>>>> method
>>>>>>> pointers, or using Lambdas.  It would be a lot easier in Ruby or Python
>>>>>>> 
>>>>>>> // linear
>>>>>>> brooklynRun(NamedTaskMethod, NamedTaskMethod)
>>>>>>> 
>>>>>>> // chained
>>>>>>> TaskMethodA()TaskMethodB().
>>>>>>> 
>>>>>>> // asynchronous
>>>>>>> brooklynJoin(NamedTaskMethod, NamedTaskMethod,...)
>>>>>>> 
>>>>>>> // conditional
>>>>>>> brooklynRunIf(NamedTaskMethod, NamedConditionMethod,...)
>>>>>>> 
>>>>>>> // iterative
>>>>>>> brooklynRunWhile(NamedTaskMethod, NamedConditionMethod,...)
>>>>>>> brooklynRunUntil(NamedTaskMethod, NamedConditionMethod,...)
>>>>>>> 
>>>>>>> // there could even be a utility to implement legacy syntax (this of
>>>>>> course
>>>>>>> would require the extra code layer I was trying to avoid)
>>>>>>> runYaml(Path)
>>>>>>> 
>>>>>>> A basic class structure might be
>>>>>>> 
>>>>>>> // where BrooklynRecipeBase implements the utility functions including,
>>>>>>> among others  Join, Run, If, While, Until mentioned above
>>>>>>> // and the BrooklynWorkflowInterface would dictate the functional
>>>>>>> requirements for the mandatory aspects of the Recipe.
>>>>>>> class MyRecipe extends BrooklynRecipeBase implements,
>>>>>>> BrooklynWorkflowInterface
>>>>>>> {
>>>>>>> Initialize()
>>>>>>> createContext()   - spin up resources
>>>>>>> workflow() - the main launch sequence using aspects of the DSL
>>>>>>> monitoring() - an asynchronous workflow used to manage sensor output or
>>>>>> for
>>>>>>> whatever needs to be done while the "orchestra" is plating
>>>>>>> shutdownHook() - called whenever shutdown is happening
>>>>>>> }
>>>>>>> 
>>>>>>> For those who don't like the smell of Java, the source file could just
>>>>>> be
>>>>>>> the contents, which would then be injected into the class framing code
>>>>>>> before compilation.
>>>>>>> 
>>>>>>> These are just ideas.  I'm not familiar enough with Brooklyn in its
>>>>>> current
>>>>>>> implementation to be able to create realistic pseudocode.
>>>>>>> 
>>>>>>> Peter
>>>>>>> 
>>>>>>> On Thu, Sep 1, 2022 at 9:24 AM Geoff Macartney <
>>>>>> geoff.macartney@gmail.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Alex,
>>>>>>>> 
>>>>>>>> That's great, I'll be excited to hear all about it.  7th September
>>>>>>>> suits me fine; I would probably prefer 4.00 p.m. over 11.00.
>>>>>>>> 
>>>>>>>> Cheers
>>>>>>>> Geoff
>>>>>>>> 
>>>>>>>> On Thu, 1 Sept 2022 at 12:41, Alex Heneveld <al...@cloudsoft.io>
>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> Thanks for the excellent feedback Geoff and yes there are some very
>>>>>> cool
>>>>>>>> and exciting things added recently -- containers, conditions, and
>>>>>> terraform
>>>>>>>> and kubernetes support, all of which make writing complex blueprints
>>>>>> much
>>>>>>>> easier.
>>>>>>>>> 
>>>>>>>>> I'd love to host a session to showcase these.
>>>>>>>>> 
>>>>>>>>> How does Wed 7 Sept sound?  I could do 11am UK or 4pm UK --
>>>>>> depending
>>>>>>>> what time suits for people who are interested.  Please RSVP and
>>>>>> indicate
>>>>>>>> your time preference!
>>>>>>>>> 
>>>>>>>>> Best
>>>>>>>>> Alex
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Wed, 31 Aug 2022 at 22:17, Geoff Macartney <
>>>>>> geoff.macartney@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Alex,
>>>>>>>>>> 
>>>>>>>>>> Another thought occurred to me when reading that workflow
>>>>>> proposal. You
>>>>>>>> wrote
>>>>>>>>>> 
>>>>>>>>>> "and with the recent support for container-based tasks and
>>>>>> declarative
>>>>>>>>>> conditions, we have taken big steps towards enabling YAML
>>>>>> authorship"
>>>>>>>>>> 
>>>>>>>>>> Unfortunately over the past while I haven't been able to keep up as
>>>>>>>>>> closely as I would like with developments in Brooklyn. I'm just
>>>>>>>>>> wondering if it might be possible to get together some time, on
>>>>>> Google
>>>>>>>>>> Meet or Zoom or whatnot, if you or a colleague could spare half an
>>>>>>>>>> hour to demo some of these recent developments? But don't worry
>>>>>> about
>>>>>>>>>> it if you're too busy at present.
>>>>>>>>>> 
>>>>>>>>>> Adding dev@ to this in CC for the sake of Openness. Others might
>>>>>> also
>>>>>>>>>> be interested!
>>>>>>>>>> 
>>>>>>>>>> Cheers
>>>>>>>>>> Geoff
>>>>>>>> 
>>>>>> 
>>>>> 
>> 


Re: Declarative Workflow update & shorthand/DSL

Posted by Geoff Macartney <ge...@gmail.com>.
Hi Mykola,

I'd have thought it would be worth giving it some time to review,
given that it is such a significant change. I'm personally of the
opinion that significant changes like this shouldn't be merged without
review. I was hoping to do some review of it myself this week, though
I don't get much time to work on Brooklyn these days, so I was also
hoping that others who are still close to the code would review it.
Did you get a chance to look through it yourself?

What do you all think? I guess if it is in place and doesn't break
anything then it's OK?

Geoff



On Mon, 12 Sept 2022 at 09:04, Mykola Mandra
<my...@cloudsoftcorp.com> wrote:
>
> Hello All,
>
> Was I too hasty to merge the PR - https://github.com/apache/brooklyn-server/pull/1358?
> We agreed to split some of the work with Alex, and I did not see why it could not be merged to continue with what I’m working on.
>
> Regards,
> Mykola

Re: Declarative Workflow update & shorthand/DSL

Posted by Mykola Mandra <my...@cloudsoftcorp.com>.
Hello All,

Was I too hasty to merge the PR - https://github.com/apache/brooklyn-server/pull/1358?
We agreed to split some of the work with Alex, and I did not see why it could not be merged to continue with what I’m working on.

Regards,
Mykola

> On 9 Sep 2022, at 20:17, Geoff Macartney <ge...@apache.org> wrote:
> 
> Hi Alex,
> 
> Thanks for the link, will have a look at the PR.
> 
> As for the new shorthand proposal, +1 from me. I think the majority of
> cases will be ones where things *are* simple - setting sensors, etc.
> Your example goes from eight lines of markup for the steps down to
> two; I think that's a compelling argument in itself in favour of the
> proposal.
> 
> Cheers
> Geoff


Re: Declarative Workflow update & shorthand/DSL

Posted by Geoff Macartney <ge...@apache.org>.
Hi Alex,

Thanks for the link, will have a look at the PR.

As for the new shorthand proposal, +1 from me. I think the majority of
cases will be ones where things *are* simple - setting sensors, etc.
Your example goes from eight lines of markup for the steps down to
two; I think that's a compelling argument in itself in favour of the
proposal.
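
To make that concrete, here is a minimal before/after sketch using the
set-sensor example from your mail quoted below (purely an illustration
of the two forms, not any new semantics). The long form:

step_update_service_up:
  type: set-sensor
  sensor:
    name: service.isUp
    type: boolean
  value: true

becomes, with the proposed one-line shorthand and its optional type:

step_update_service_up: set-sensor boolean service.isUp = true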

Cheers
Geoff

On Thu, 8 Sept 2022 at 09:35, Alex Heneveld <al...@cloudsoft.io> wrote:
>
> Hi team,
>
> An initial PR with a few types and the ability to define an effector is
> available [1].
>
> This is enough for the next steps to be parallelized, e.g. new steps
> added.  The proposal has been updated with a work plan / list of tasks
> [2].  Any volunteers to help with some of the upcoming tasks let me know.
>
> Finally I've been thinking about the "shorthand syntax" and how to bring us
> closer to Peter's proposal of a DSL.  The original proposal allowed instead
> of a map e.g.
>
> step_sleep:
>   type: sleep
>   duration: 5s
>
> or
>
> step_update_service_up:
>   type: set-sensor
>   sensor:
>     name: service.isUp
>     type: boolean
>   value: true
>
> being able to use a shorthand _map_ with a single key being the type, and
> value interpreted by that type, so in the OLD SHORTHAND PROPOSAL the above
> could be written:
>
> step_sleep:
>   sleep: 5s
>
> step_update_service_up:
>   set-sensor: service.isUp = true
>
> Having played with syntaxes a bit I wonder if we should instead say the
> shorthand DSL kicks in when the step _body_ is a string (instead of a
> single-key map), and the first word of the string being the type, and the
> remainder interpreted by the type, and we allow it to be a bit more
> ambitious.
>
> Concretely this NEW SHORTHAND PROPOSAL would look something like:
>
> step_sleep: sleep 5s
> step_update_service_up: set-sensor service.isUp = true
> # also supporting a type, ie `set-sensor [TYPE] NAME = VALUE`, eg
> step_update_service_up: set-sensor boolean service.isUp = true
>
> You would still need the full map syntax whenever defining flow logic -- eg
> condition, next, retry, or timeout -- or any property not supported by the
> shorthand syntax.  But for the (majority?) simple cases the expression
> would be very concise.  In most cases I think it would feel like a DSL but
> has the virtue of a very clear translation to the actual workflow model and
> the underlying (YAML) model needed for resumption and UI.
>
> As a final example, the example used at the start of the proposal
> (simplified a little -- removing on-error retry and env map as those
> wouldn't be supported by shorthand):
>
> brooklyn.initializers:
> - type: workflow-effector
>   name: run-spark-on-gcp
>   steps:
>     1:
>       type: container
>       image: my/google-cloud
>       command: gcloud dataproc jobs submit spark --BUCKET=gs://$brooklyn:config("bucket")
>     2:
>       type: set-sensor
>       sensor: spark-output
>       value: ${1.stdout}
>
> Could be written in this shorthand as follows:
>
>   steps:
>     1: container my/google-cloud command "gcloud dataproc jobs submit spark --BUCKET=gs://${entity.config.bucket}"
>     2: set-sensor spark-output ${1.stdout}
>
> Thoughts?
>
> Best
> Alex
>
>
> [1] https://github.com/apache/brooklyn-server/pull/1358
> [2]
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.gbadaqa2yql6
>
>
> On Wed, 7 Sept 2022 at 09:58, Alex Heneveld <al...@cloudsoft.io> wrote:
>
> > Hi Peter,
> >
> > Yes - thanks for the extra details.  I did take your suggestion to be a
> > procedural DSL not YAML, per the illustration at [1] (second code block).
> > Probably where I was confusing things was in saying that unlike DSLs which just
> > run (and where the execution can be delegated to eg java/groovy/ruby), here
> > we need to understand and display, store and resume the workflow progress.
> > So I think it needs to be compiled to some representation that is well
> > described and that new Apache Brooklyn code can reason about, both in the
> > UI (JS) and backend (Java).  Parsing a DSL is much harder than using YAML
> > for this "reasonable" representation (as in we can reason _about_ it :) ),
> > because we already have good backend processing, persistence,
> > serialization; and frontend processing and visualization support for
> > YAML-based models.  So I think we almost definitely want a well-described
> > declarative YAML model of the workflow.
> >
> > We might *also* want a Workflow DSL because I agree with you a DSL would
> > be nicer for a user to write (if writing by hand; although if composing
> > visually a drag-and-drop to YAML is probably easier).  However it should
> > probably get "compiled" into a Workflow YAML.  So I'm suggesting we do the
> > workflow YAML at this stage, and a DSL that compiles into that YAML can be
> > designed later.  (Designing a good DSL and parser and reason-about-able
> > representation is a big task, so being able to separate it feels good too!)
> >
> > Best
> > Alex
> >
> > [1]
> > https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.75wm48pjvx0h
> >
> >
> > On Fri, 2 Sept 2022 at 20:17, Geoff Macartney <ge...@gmail.com>
> > wrote:
> >
> >> Hi Peter,
> >>
> >> Thanks for such a detailed writeup of how you see this working. I fear
> >> I've too little experience with this sort of thing to be able to say
> >> anything very useful about it. My thought on the matter would be,
> >> let's get started with the yaml based approach and see how it goes. I
> >> think that experience would then give us a much better feel for what a
> >> really nice and usable DSL for workflows would look like (probably to
> >> address all the pain points of the yaml approach! :-)   The outline
> >> above will then be a good starting point, I'm sure.
> >>
> >> Cheers
> >> Geoff
> >>
> >> On Thu, 1 Sept 2022 at 21:26, Peter Abramowitsch
> >> <pa...@gmail.com> wrote:
> >> >
> >> > Hi All
> >> > I just wanted to clarify something in my comment the other day about
> >> DSLs
> >> > since I see that the acronym was also used in Alex's original document.
> >> > Unless I misunderstood, Alex was proposing to create a DSL for Brooklyn
> >> > using yaml as syntax and writing a code layer to translate between that
> >> > syntax and underlying APIs which are presumably all in Java.
> >> >
> >> > What I was suggesting was a DSL written directly in  Java (I guess)
> >> whose
> >> > syntax would be that language, but whose grammar would be keywords that
> >> > were also Java functions.  Some of these functions would be pre-defined
> >> in
> >> > the DSL, while others could be  defined by the user and could use other
> >> > functions of the DSL.  The result would be turned into a JAR file (or
> >> > equivalent in another platform).  But during the compile phase, it would
> >> > be checked for errors, and it could be debugged line by line, either
> >> > invoking live functionality or using a library of mock versions of the
> >> > Brooklyn API.
> >> >
> >> > In this 'native' DSL one could provide different types of workflow
> >> > constructs as functions (in the BaseClass), taking function names as
> >> > method pointers, or using lambdas.  It would be a lot easier in Ruby or
> >> > Python.
> >> >
> >> > // linear
> >> > brooklynRun(NamedTaskMethod, NamedTaskMethod)
> >> >
> >> > // chained
> >> > TaskMethodA().TaskMethodB()
> >> >
> >> > // asynchronous
> >> > brooklynJoin(NamedTaskMethod, NamedTaskMethod,...)
> >> >
> >> > // conditional
> >> > brooklynRunIf(NamedTaskMethod, NamedConditionMethod,...)
> >> >
> >> > // iterative
> >> > brooklynRunWhile(NamedTaskMethod, NamedConditionMethod,...)
> >> > brooklynRunUntil(NamedTaskMethod, NamedConditionMethod,...)
> >> >
> >> > // there could even be a utility to implement legacy syntax (this of
> >> course
> >> > would require the extra code layer I was trying to avoid)
> >> > runYaml(Path)
> >> >
> >> > A basic class structure might be
> >> >
> >> > // where BrooklynRecipeBase implements the utility functions including,
> >> > among others  Join, Run, If, While, Until mentioned above
> >> > // and the BrooklynWorkflowInterface would dictate the functional
> >> > requirements for the mandatory aspects of the Recipe.
> >> > class MyRecipe extends BrooklynRecipeBase implements
> >> > BrooklynWorkflowInterface
> >> > {
> >> > Initialize()
> >> > createContext()   - spin up resources
> >> > workflow() - the main launch sequence using aspects of the DSL
> >> > monitoring() - an asynchronous workflow used to manage sensor output
> >> > or for whatever needs to be done while the "orchestra" is playing
> >> > shutdownHook() - called whenever shutdown is happening
> >> > }
> >> >
> >> > For those who don't like the smell of Java, the source file could just
> >> be
> >> > the contents, which would then be injected into the class framing code
> >> > before compilation.
> >> >
> >> > These are just ideas.  I'm not familiar enough with Brooklyn in its
> >> current
> >> > implementation to be able to create realistic pseudocode.
> >> >
> >> > Peter
> >> >
> >> > On Thu, Sep 1, 2022 at 9:24 AM Geoff Macartney <
> >> geoff.macartney@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi Alex,
> >> > >
> >> > > That's great, I'll be excited to hear all about it.  7th September
> >> > > suits me fine; I would probably prefer 4.00 p.m. over 11.00.
> >> > >
> >> > > Cheers
> >> > > Geoff
> >> > >
> >> > > On Thu, 1 Sept 2022 at 12:41, Alex Heneveld <al...@cloudsoft.io>
> >> wrote:
> >> > > >
> >> > > > Thanks for the excellent feedback Geoff and yes there are some very
> >> cool
> >> > > and exciting things added recently -- containers, conditions, and
> >> terraform
> >> > > and kubernetes support, all of which make writing complex blueprints
> >> much
> >> > > easier.
> >> > > >
> >> > > > I'd love to host a session to showcase these.
> >> > > >
> >> > > > How does Wed 7 Sept sound?  I could do 11am UK or 4pm UK --
> >> depending
> >> > > what time suits for people who are interested.  Please RSVP and
> >> indicate
> >> > > your time preference!
> >> > > >
> >> > > > Best
> >> > > > Alex
> >> > > >
> >> > > >
> >> > > > On Wed, 31 Aug 2022 at 22:17, Geoff Macartney <
> >> geoff.macartney@gmail.com>
> >> > > wrote:
> >> > > >>
> >> > > >> Hi Alex,
> >> > > >>
> >> > > >> Another thought occurred to me when reading that workflow
> >> proposal. You
> >> > > wrote
> >> > > >>
> >> > > >> "and with the recent support for container-based tasks and
> >> declarative
> >> > > >> conditions, we have taken big steps towards enabling YAML
> >> authorship"
> >> > > >>
> >> > > >> Unfortunately over the past while I haven't been able to keep up as
> >> > > >> closely as I would like with developments in Brooklyn. I'm just
> >> > > >> wondering if it might be possible to get together some time, on
> >> Google
> >> > > >> Meet or Zoom or whatnot, if you or a colleague could spare half an
> >> > > >> hour to demo some of these recent developments? But don't worry
> >> about
> >> > > >> it if you're too busy at present.
> >> > > >>
> >> > > >> Adding dev@ to this in CC for the sake of Openness. Others might
> >> also
> >> > > >> be interested!
> >> > > >>
> >> > > >> Cheers
> >> > > >> Geoff
> >> > >
> >>
> >