Posted to dev@beam.apache.org by Alex Merose via dev <de...@beam.apache.org> on 2022/07/08 23:03:44 UTC

RE: Join a meeting to help coordinate implementing a Dask Runner for Beam

Dear Beam & Dask communities,

Together with Pablo and Charles, I've hacked together an initial prototype
of a Dask runner for Beam. I'm happy to announce that I have a minimum viable
working version in a fork here: https://github.com/alxmrs/beam/pull/1

There's definitely more work to do here – more operations to implement,
tests to write, style guides to follow, etc. However, I'm pleased that
there are enough operations implemented to run test pipelines with
assertions.

From here, what are good next steps?

Best,
Alex

PS – Meeting / design notes are available in this doc:
https://docs.google.com/document/d/1Awj_eNmH-WRSte3bKcCcUlQDiZ5mMKmCO_xV-mHWAak/edit#heading=h.y0pwg4polebc

On 2022/06/08 14:22:41 Ryan Abernathey wrote:
> Dear Beamer,
>
> Thank you for all of your work on this amazing project. I am new to Beam
> and am quite excited about its potential to help with some data processing
> challenges in my field of climate science.
>
> Our community is interested in running Beam on Dask Distributed clusters,
> which we already know how to deploy. This has been discussed at
> https://issues.apache.org/jira/browse/BEAM-5336 and
> https://github.com/apache/beam/issues/18962. It seems technically
> feasible.
>
> We are trying to organize a meeting next week to kickstart and coordinate
> this effort. It would be great if we could entrain some Beam maintainers
> into this meeting. If you have interest in this topic and are available
> next week, please share your availability here -
> https://www.when2meet.com/?15861604-jLnA4
>
> Alternatively, if you have any guidance or suggestions you wish to provide
> by email or GitHub discussion, we welcome your input.
>
> Thanks again for your open source work.
>
> Best,
> Ryan Abernathey
>