Posted to dev@beam.apache.org by Rose Nguyen <rt...@google.com> on 2018/11/15 22:08:19 UTC

Apache Beam Newsletter - November 2018

November 2018 | Newsletter

What’s been done

Beam Community Metrics (by: Mikhail Gryzykhin, Udi Meiri, Huygaa Batsaikhan)

   - Added a dashboarding platform to help track project health.
   - Initial dashboards track pre- and post-commit test status and
     engineering load.
   - Leave feedback under BEAM-5862.
   - View the dashboards here <https://s.apache.org/beam-community-metrics>.

Apache Beam 2.8.0 released! (by: many contributors)

   - Major new features and improvements, such as the Python on Flink MVP.
   - You can download the release here
     <https://beam.apache.org/get-started/downloads/>.
   - See the blog post
     <https://beam.apache.org/blog/2018/10/29/beam-2.8.0.html> for more
     details.


New Edit button on beam.apache.org pages (by: Alan Myrvold, Scott Wegner)

   - To make it easier for non-committers to update documentation, an edit
     button has been added to https://beam.apache.org pages. It helps you
     create a pull request using the GitHub web UI.
   - See BEAM-4431 for more details.

RabbitMqIO (by: Jean-Baptiste Onofré)

   - An IO to publish or consume messages with a RabbitMQ broker.


Graphite sink for metrics (by: Etienne Chauchot)

   - Metrics Pusher can now export Beam metrics to Graphite.


BeamSQL (by: Rui Wang, Mingming Xu)

   - Added 13 built-in SQL functions.
   - Enabled function overloading for UDFs through a new UDF registration
     approach.
   - UDFs now support Joda DateTime as an argument type.

What we're working on...

Flink Portable Runner (by: Ankur Goenka, Maximilian Michels, Thomas Weise,
Ryan Williams, Robert Bradshaw)

   - Integration of timers in user functions for streaming and batch
     execution
   - Enabling TFX pipelines to run on Flink
   - Investigating the integration of metrics
   - Bug fixes


Load tests of Core Apache Beam Operations (by: Łukasz Gajowy, Katarzyna
Kucharczyk)

   - Test operations such as GroupByKey, ParDo, and Combine under stressful
     conditions.
   - See https://s.apache.org/load-test-basic-operations for more details on
     how it works.

New Members
New Committers

   - David Morávek, Pilsen, Czech Republic
      - Using Beam for an internet-scale web crawler
      - See BEAM-3900 for more details.

Talks & Meetups

Hadoop User Group @ Warsaw, Poland

   - "Apache Beam - what do I gain?" by Łukasz Gajowy (link to the meetup
     <https://www.meetup.com/warsaw-hug/events/255227113/>)
   - We discussed the basics of the Dataflow model, covered Beam in more
     detail, and familiarized the audience with the current state of the
     project.


Resources

Blog Post on London Summit (by: Matthias Baetens)

   - “Inaugural edition of the Beam Summit Europe 2018 - aftermath”: a recap
     of the conference, including the presentation slide decks.
   - See the post here
     <https://beam.apache.org/blog/2018/10/31/beam-summit-aftermath.html> and
     videos of the sessions on the Apache Beam YouTube channel
     <https://www.youtube.com/c/apachebeam>.

How to transfer BigQuery tables between locations (by: Graham Polley)

   - A Cloud Dataflow solution in Java for transferring BigQuery tables,
     including source code
     <https://github.com/polleyg/gcp-dataflow-copy-bigquery>.
   - See the Medium article here
     <https://medium.com/weareservian/how-to-transfer-bigquery-tables-between-locations-with-cloud-dataflow-9582acc6ae1d>.


Hands on Apache Beam, building data pipelines in Python (by: Vincent Teyssier)

   - Writing a Beam pipeline in Python to compute the mean of the Open and
     Close columns for a historical S&P 500 dataset.
   - See the Medium Towards Data Science article here
     <https://towardsdatascience.com/hands-on-apache-beam-building-data-pipelines-in-python-6548898b66a5>
     and GitHub tutorial here
     <https://github.com/vincentteyssier/apache-beam-tutorial>.


*Until Next Time!*

*This edition was curated by our community of contributors, committers and
PMC members. It contains work done in November 2018 and ongoing efforts. We
hope to provide visibility into what's going on in the community, so if you
have questions, feel free to ask in this thread.*
-- 
Rose Thị Nguyễn