Posted to commits@beam.apache.org by gi...@apache.org on 2020/09/18 00:41:48 UTC

[beam] branch asf-site updated: Publishing website 2020/09/18 00:41:21 at commit f4c2734

This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 389272c  Publishing website 2020/09/18 00:41:21 at commit f4c2734
389272c is described below

commit 389272c63844360455899a19e3cc673aee1ec5f0
Author: jenkins <bu...@apache.org>
AuthorDate: Fri Sep 18 00:41:30 2020 +0000

    Publishing website 2020/09/18 00:41:21 at commit f4c2734
---
 .../blog/2020/08/27/pattern-match-beam-sql.html    |   1 +
 website/generated-content/blog/index.html          |   4 +-
 website/generated-content/blog/index.xml           | 215 ++++++++++++++--
 .../blog/pattern-match-beam-sql/index.html         |  56 +++++
 .../generated-content/categories/blog/index.xml    | 206 +++++++++++++--
 website/generated-content/categories/index.xml     |   2 +-
 website/generated-content/feed.xml                 | 278 ++++++++++++++-------
 website/generated-content/index.html               |   2 +-
 website/generated-content/sitemap.xml              |   2 +-
 9 files changed, 633 insertions(+), 133 deletions(-)

diff --git a/website/generated-content/blog/2020/08/27/pattern-match-beam-sql.html b/website/generated-content/blog/2020/08/27/pattern-match-beam-sql.html
new file mode 100644
index 0000000..c938f7b
--- /dev/null
+++ b/website/generated-content/blog/2020/08/27/pattern-match-beam-sql.html
@@ -0,0 +1 @@
+<!doctype html><html><head><title>/blog/pattern-match-beam-sql/</title><link rel=canonical href=/blog/pattern-match-beam-sql/><meta name=robots content="noindex"><meta charset=utf-8><meta http-equiv=refresh content="0; url=/blog/pattern-match-beam-sql/"></head></html>
\ No newline at end of file
diff --git a/website/generated-content/blog/index.html b/website/generated-content/blog/index.html
index f43817f..0cffecb 100644
--- a/website/generated-content/blog/index.html
+++ b/website/generated-content/blog/index.html
@@ -1,7 +1,9 @@
 <!doctype html><html lang=en class=no-js><head><meta charset=utf-8><meta http-equiv=x-ua-compatible content="IE=edge"><meta name=viewport content="width=device-width,initial-scale=1"><title>Blogs</title><meta name=description content="Apache Beam is an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterprise Integration Patterns (EIPs) and Domain Specific Languag [...]
 <span class=sr-only>Toggle navigation</span>
 <span class=icon-bar></span><span class=icon-bar></span><span class=icon-bar></span></button>
-<a href=/ class=navbar-brand><img alt=Brand style=height:25px src=/images/beam_logo_navbar.png></a></div><div class="navbar-mask closed"></div><div id=navbar class="navbar-container closed"><ul class="nav navbar-nav"><li><a href=/get-started/beam-overview/>Get Started</a></li><li><a href=/documentation/>Documentation</a></li><li><a href=/documentation/sdks/java/>Languages</a></li><li><a href=/documentation/runners/capability-matrix/>RUNNERS</a></li><li><a href=/roadmap/>Roadmap</a></li>< [...]
+<a href=/ class=navbar-brand><img alt=Brand style=height:25px src=/images/beam_logo_navbar.png></a></div><div class="navbar-mask closed"></div><div id=navbar class="navbar-container closed"><ul class="nav navbar-nav"><li><a href=/get-started/beam-overview/>Get Started</a></li><li><a href=/documentation/>Documentation</a></li><li><a href=/documentation/sdks/java/>Languages</a></li><li><a href=/documentation/runners/capability-matrix/>RUNNERS</a></li><li><a href=/roadmap/>Roadmap</a></li>< [...]
+•
+Mark-Zeng</i></p>SQL is becoming increasingly powerful and useful in the field of data analysis.<p><a class="btn btn-default btn-sm" href=/blog/pattern-match-beam-sql/ role=button>Read more&nbsp;<span class="glyphicon glyphicon-menu-right" aria-hidden=true></span></a></p><hr><h3><a class=post-link href=/blog/python-improved-annotations/>Improved Annotation Support for the Python SDK</a></h3><p><i>Aug 21, 2020
 •
 Saavan Nanavati</i></p>The importance of static type checking in a dynamically typed language like Python is not up for debate.<p><a class="btn btn-default btn-sm" href=/blog/python-improved-annotations/ role=button>Read more&nbsp;<span class="glyphicon glyphicon-menu-right" aria-hidden=true></span></a></p><hr><h3><a class=post-link href=/blog/python-performance-runtime-type-checking/>Performance-Driven Runtime Type Checking for the Python SDK</a></h3><p><i>Aug 21, 2020
 •
diff --git a/website/generated-content/blog/index.xml b/website/generated-content/blog/index.xml
index 89c39f1..bcab2a6 100644
--- a/website/generated-content/blog/index.xml
+++ b/website/generated-content/blog/index.xml
@@ -1,4 +1,192 @@
-<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Apache Beam – Blogs</title><link>/blog/</link><description>Recent content in Blogs on Apache Beam</description><generator>Hugo -- gohugo.io</generator><lastBuildDate>Fri, 21 Aug 2020 00:00:01 -0800</lastBuildDate><atom:link href="/blog/index.xml" rel="self" type="application/rss+xml"/><item><title>Blog: Improved Annotation Support for the Python SDK</title><link>/blog/python-improved-annotations/</link><pubDate>F [...]
+<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Apache Beam – Blogs</title><link>/blog/</link><description>Recent content in Blogs on Apache Beam</description><generator>Hugo -- gohugo.io</generator><lastBuildDate>Thu, 27 Aug 2020 00:00:01 +0800</lastBuildDate><atom:link href="/blog/index.xml" rel="self" type="application/rss+xml"/><item><title>Blog: Pattern Matching with Beam SQL</title><link>/blog/pattern-match-beam-sql/</link><pubDate>Thu, 27 Aug 2020 00:00 [...]
+&lt;!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+&lt;h2 id="introduction">Introduction&lt;/h2>
+&lt;p>SQL is becoming increasingly powerful and useful in the field of data analysis. MATCH_RECOGNIZE,
+a new SQL component introduced in 2016, brings extra analytical functionality. This project,
+as part of Google Summer of Code, aims to support basic MATCH_RECOGNIZE functionality. A basic MATCH_RECOGNIZE
+query would be something like this:
+&lt;div class=language-sql>
+&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="k">SELECT&lt;/span> &lt;span class="n">T&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">aid&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">T&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">bid&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">T&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">cid&lt;/span>
+&lt;span class="k">FROM&lt;/span> &lt;span class="n">MyTable&lt;/span>
+&lt;span class="n">MATCH_RECOGNIZE&lt;/span> &lt;span class="p">(&lt;/span>
+&lt;span class="n">PARTITION&lt;/span> &lt;span class="k">BY&lt;/span> &lt;span class="n">userid&lt;/span>
+&lt;span class="k">ORDER&lt;/span> &lt;span class="k">BY&lt;/span> &lt;span class="n">proctime&lt;/span>
+&lt;span class="n">MEASURES&lt;/span>
+&lt;span class="n">A&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">id&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">aid&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="n">B&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">id&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">bid&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="k">C&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">id&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">cid&lt;/span>
+&lt;span class="n">PATTERN&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">A&lt;/span> &lt;span class="n">B&lt;/span> &lt;span class="k">C&lt;/span>&lt;span class="p">)&lt;/span>
+&lt;span class="n">DEFINE&lt;/span>
+&lt;span class="n">A&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s1">&amp;#39;a&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="n">B&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s1">&amp;#39;b&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="k">C&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s1">&amp;#39;c&amp;#39;&lt;/span>
+&lt;span class="p">)&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">T&lt;/span>&lt;/code>&lt;/pre>&lt;/div>
+&lt;/div>
+&lt;/p>
+&lt;p>The above query finds ordered sequences of events whose names are &amp;lsquo;a&amp;rsquo;, &amp;lsquo;b&amp;rsquo; and &amp;lsquo;c&amp;rsquo;. Apart from this basic usage of
+MATCH_RECOGNIZE, I also supported a few other crucial features, such as quantifiers and row pattern navigation. I will spell out
+the details in later sections.&lt;/p>
+&lt;h2 id="approach--discussion">Approach &amp;amp; Discussion&lt;/h2>
+&lt;p>The implementation is built on Beam core transforms. Specifically, one MATCH_RECOGNIZE execution is composed of the
+following series of transforms:&lt;/p>
+&lt;ol>
+&lt;li>A &lt;code>ParDo&lt;/code> transform and then a &lt;code>GroupByKey&lt;/code> transform that build up the partitions (PARTITION BY).&lt;/li>
+&lt;li>A &lt;code>ParDo&lt;/code> transform that sorts within each partition (ORDER BY).&lt;/li>
+&lt;li>A &lt;code>ParDo&lt;/code> transform that applies pattern matching in each sorted partition.&lt;/li>
+&lt;/ol>
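The three transforms above can be sketched in plain Python. This is an illustrative stand-in, not the actual Beam Java code from the PR: `match_recognize` and `abc_matcher` are hypothetical helper names, and ordinary dicts and lists play the role of rows and `PCollection`s.

```python
from itertools import groupby

def match_recognize(rows, partition_key, order_key, matcher):
    """Mimic the pipeline: build partitions (PARTITION BY), sort within
    each partition (ORDER BY), then pattern-match each sorted partition."""
    # 1. Build up the partitions (ParDo + GroupByKey in the real pipeline).
    keyed = sorted(rows, key=partition_key)
    partitions = [list(g) for _, g in groupby(keyed, key=partition_key)]
    out = []
    for partition in partitions:
        # 2. Sort within each partition.
        partition.sort(key=order_key)
        # 3. Apply the pattern-match to the sorted partition.
        out.extend(matcher(partition))
    return out

def abc_matcher(rows):
    """Emit (aid, bid, cid) for each consecutive run of names a, b, c,
    mirroring PATTERN (A B C) from the first example query."""
    names = [r["name"] for r in rows]
    return [(rows[i]["id"], rows[i + 1]["id"], rows[i + 2]["id"])
            for i in range(len(rows) - 2)
            if names[i:i + 3] == ["a", "b", "c"]]

rows = [
    {"userid": 1, "proctime": 3, "name": "c", "id": 103},
    {"userid": 1, "proctime": 1, "name": "a", "id": 101},
    {"userid": 1, "proctime": 2, "name": "b", "id": 102},
    {"userid": 2, "proctime": 1, "name": "a", "id": 201},
]
print(match_recognize(rows, lambda r: r["userid"],
                      lambda r: r["proctime"], abc_matcher))
# -> [(101, 102, 103)]
```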
+&lt;p>Pattern matching was first implemented with the Java regex library. That is, I first transform the rows within a partition into
+a string and then apply regex pattern-matching routines. If a row satisfies a condition, I output the corresponding pattern variable.
+This works under the assumption that the pattern definitions are mutually exclusive. That is, a pattern definition like &lt;code>A AS A.price &amp;gt; 0, B AS B.price &amp;lt; 0&lt;/code> is allowed, while
+a pattern definition like &lt;code>A AS A.price &amp;gt; 0, B AS B.proctime &amp;gt; 0&lt;/code> might result in an incomplete match. In the latter case,
+an event can satisfy the conditions A and B at the same time. Mutually exclusive conditions give a deterministic pattern match:
+each event can belong to at most one pattern class.&lt;/p>
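The encode-then-regex idea can be shown with a small sketch. This is illustrative only, assuming hypothetical helpers `classify` and `regex_match`, and Python's `re` standing in for the Java regex library:

```python
import re

def classify(row, conditions):
    """Map a row to its pattern variable. Assumes the conditions are
    mutually exclusive, so at most one variable can claim a row."""
    for var, predicate in conditions.items():
        if predicate(row):
            return var
    return "_"  # no pattern variable matched

def regex_match(rows, conditions, pattern):
    # Encode the partition as a string of pattern-variable letters...
    encoded = "".join(classify(r, conditions) for r in rows)
    # ...then let the regex engine locate the pattern within it.
    m = re.search(pattern, encoded)
    return m.span() if m else None

conditions = {
    "A": lambda r: r["name"] == "a",
    "B": lambda r: r["name"] == "b",
    "C": lambda r: r["name"] == "c",
}
rows = [{"name": n} for n in ["x", "a", "b", "c"]]
# PATTERN (A B C) becomes the regex "ABC" over the encoded string "_ABC".
print(regex_match(rows, conditions, "ABC"))  # -> (1, 4)
```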
+&lt;p>As specified in the SQL:2016 standard, MATCH_RECOGNIZE defines a richer set of expressions than regular expressions. Specifically,
+it introduces &lt;em>Row Pattern Navigation Operations&lt;/em> such as &lt;code>PREV&lt;/code> and &lt;code>NEXT&lt;/code>. This is perhaps one of the most intriguing features of
+MATCH_RECOGNIZE. A regex library no longer suffices, since a pattern definition can be back-referencing (&lt;code>PREV&lt;/code>) or
+forward-referencing (&lt;code>NEXT&lt;/code>). So for the second version of the implementation, we chose to use an NFA-based regex engine. An NFA brings more flexibility
+in terms of non-determinism (see Chapter 6 of SQL:2016 Part 5 for a more thorough discussion). My proposed NFA is based on a paper from UMass.&lt;/p>
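A tiny sketch of why a regex stops being enough: a regex works when each row maps to a pattern variable in isolation, but with PREV the predicate also needs a neighbouring row, so the partition cannot be encoded up front and the matcher must walk the rows carrying state, which is what the NFA does. The helper below is hypothetical, with PREV read as the physically previous row and, per the note later in the post, the first row treated as a match:

```python
def a_cond(rows, i):
    """Evaluate A AS price < PREV(A.price) for the row at index i.
    Hypothetical helper: PREV is read as the physically previous row,
    and the first row (no previous row) is treated as a match."""
    return i == 0 or rows[i]["price"] < rows[i - 1]["price"]

rows = [{"price": p} for p in [3, 2, 1, 5, 6]]
print([a_cond(rows, i) for i in range(len(rows))])
# -> [True, True, True, False, False]
```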
+&lt;p>This is a work in progress; many components are not yet supported. I will list some of the unimplemented work in the
+Future Work section.&lt;/p>
+&lt;h2 id="usages">Usages&lt;/h2>
+&lt;p>So far, the supported components are:&lt;/p>
+&lt;ul>
+&lt;li>PARTITION BY&lt;/li>
+&lt;li>ORDER BY&lt;/li>
+&lt;li>MEASURES
+&lt;ol>
+&lt;li>LAST&lt;/li>
+&lt;li>FIRST&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;li>ONE ROW PER MATCH/ALL ROWS PER MATCH&lt;/li>
+&lt;li>DEFINE
+&lt;ol>
+&lt;li>Left side of the condition
+&lt;ol>
+&lt;li>LAST&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;li>Right side of the condition
+&lt;ol>
+&lt;li>PREV&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;li>Quantifier
+&lt;ol>
+&lt;li>Kleene plus&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;/ul>
+&lt;p>The pattern definition evaluation is currently hard-coded. To be more specific, it expects the column reference of the incoming row
+to be on the left side of a comparator. Additionally, the PREV function can appear only on the right side of the comparator.&lt;/p>
+&lt;p>With these limited tools, we can already write some slightly more complicated queries. Imagine we have the following
+table:&lt;/p>
+&lt;table>
+&lt;thead>
+&lt;tr>
+&lt;th align="center">transTime&lt;/th>
+&lt;th align="center">price&lt;/th>
+&lt;/tr>
+&lt;/thead>
+&lt;tbody>
+&lt;tr>
+&lt;td align="center">1&lt;/td>
+&lt;td align="center">3&lt;/td>
+&lt;/tr>
+&lt;tr>
+&lt;td align="center">2&lt;/td>
+&lt;td align="center">2&lt;/td>
+&lt;/tr>
+&lt;tr>
+&lt;td align="center">3&lt;/td>
+&lt;td align="center">1&lt;/td>
+&lt;/tr>
+&lt;tr>
+&lt;td align="center">4&lt;/td>
+&lt;td align="center">5&lt;/td>
+&lt;/tr>
+&lt;tr>
+&lt;td align="center">5&lt;/td>
+&lt;td align="center">6&lt;/td>
+&lt;/tr>
+&lt;/tbody>
+&lt;/table>
+&lt;p>This table reflects the price changes of a product with respect to the transaction time. We could write the following
+query:&lt;/p>
+&lt;div class=language-sql>
+&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="k">SELECT&lt;/span> &lt;span class="o">*&lt;/span>
+&lt;span class="k">FROM&lt;/span> &lt;span class="n">MyTable&lt;/span>
+&lt;span class="n">MATCH_RECOGNIZE&lt;/span> &lt;span class="p">(&lt;/span>
+&lt;span class="k">ORDER&lt;/span> &lt;span class="k">BY&lt;/span> &lt;span class="n">transTime&lt;/span>
+&lt;span class="n">MEASURES&lt;/span>
+&lt;span class="k">LAST&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">A&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">price&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">beforePrice&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="k">FIRST&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">B&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">price&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">afterPrice&lt;/span>
+&lt;span class="n">PATTERN&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">A&lt;/span>&lt;span class="o">+&lt;/span> &lt;span class="n">B&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="p">)&lt;/span>
+&lt;span class="n">DEFINE&lt;/span>
+&lt;span class="n">A&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">price&lt;/span> &lt;span class="o">&amp;lt;&lt;/span> &lt;span class="n">PREV&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">A&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">price&lt;/span>&lt;span class="p">),&lt;/span>
+&lt;span class="n">B&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">price&lt;/span> &lt;span class="o">&amp;gt;&lt;/span> &lt;span class="n">PREV&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">B&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">price&lt;/span>&lt;span class="p">)&lt;/span>
+&lt;span class="p">)&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">T&lt;/span>&lt;/code>&lt;/pre>&lt;/div>
+&lt;/div>
+&lt;p>This query finds the local minimum price and the price that follows it. For the example dataset, the first 3 rows are
+mapped to A and the remaining rows to B, so the result is (1, 5).&lt;/p>
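The run of this query can be traced with a toy sequential matcher. This is not the PR's shared-buffer NFA; it is a hypothetical sketch that assumes PREV reads the physically previous row and that the first row always matches, as described in the note that follows:

```python
def match_a_plus_b_plus(rows):
    """Toy matcher for PATTERN (A+ B+) with
         A AS price < PREV(A.price)  and  B AS price > PREV(B.price).
    Illustrative only: PREV is read as the physically previous row,
    and the first row (no previous row) always matches A."""
    a_run, b_run = [], []
    prev = None
    for row in rows:
        falling = prev is None or row["price"] < prev["price"]
        rising = prev is not None and row["price"] > prev["price"]
        if not b_run and falling:
            a_run.append(row)        # still in the A+ part of the pattern
        elif a_run and rising:
            b_run.append(row)        # switched to the B+ part
        else:
            break
        prev = row
    # MEASURES: LAST(A.price) AS beforePrice, FIRST(B.price) AS afterPrice
    return a_run[-1]["price"], b_run[0]["price"]

table = [{"transTime": t, "price": p}
         for t, p in [(1, 3), (2, 2), (3, 1), (4, 5), (5, 6)]]
print(match_a_plus_b_plus(table))  # -> (1, 5)
```

On the sample table the first three rows (prices 3, 2, 1) fall into the A run and the last two (5, 6) into the B run, so the measures evaluate to (1, 5), matching the result stated above.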
+&lt;blockquote>
+&lt;p>Very important: my NFA implementation slightly deviates from the SQL standard. Since the buffered NFA
+stores an event in the buffer only if the event matches some pattern class, there is no way to retrieve the
+previous event once the previous row has been discarded. So when PREV is used, the first row is always
+treated as a match (unlike in the standard).&lt;/p>
+&lt;/blockquote>
+&lt;h2 id="progress">Progress&lt;/h2>
+&lt;ol>
+&lt;li>PRs
+&lt;ol>
+&lt;li>&lt;a href="https://github.com/apache/beam/pull/12232">Support MATCH_RECOGNIZE using regex library&lt;/a> (merged)&lt;/li>
+&lt;li>&lt;a href="https://github.com/apache/beam/pull/12532">Support MATCH_RECOGNIZE using NFA&lt;/a> (pending)&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;li>Commits
+&lt;ol>
+&lt;li>partition by: &lt;a href="https://github.com/apache/beam/pull/12232/commits/064ada7257970bcb1d35530be1b88cb3830f242b">commit 064ada7&lt;/a>&lt;/li>
+&lt;li>order by: &lt;a href="https://github.com/apache/beam/pull/12232/commits/9cd1a82bec7b2f7c44aacfbd72f5f775bb58b650">commit 9cd1a82&lt;/a>&lt;/li>
+&lt;li>regex pattern match: &lt;a href="https://github.com/apache/beam/pull/12232/commits/8d6ffcc213e30999fc495c119b68da4f62fad258">commit 8d6ffcc&lt;/a>&lt;/li>
+&lt;li>support quantifiers: &lt;a href="https://github.com/apache/beam/pull/12232/commits/f529b876a2c2e43d012c71b3a83ebd55eb16f4ff">commit f529b87&lt;/a>&lt;/li>
+&lt;li>measures: &lt;a href="https://github.com/apache/beam/pull/12232/commits/87935746647611aa139d664ebed10c8e638bb024">commit 8793574&lt;/a>&lt;/li>
+&lt;li>added NFA implementation: &lt;a href="https://github.com/apache/beam/pull/12532/commits/fc731f2b0699d11853e7b76da86456427d434a2a">commit fc731f2&lt;/a>&lt;/li>
+&lt;li>implemented functions PREV and LAST: &lt;a href="https://github.com/apache/beam/pull/12532/commits/fc731f2b0699d11853e7b76da86456427d434a2a">commit 35323da&lt;/a>&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;/ol>
+&lt;h2 id="future-work">Future Work&lt;/h2>
+&lt;ul>
+&lt;li>Support FINAL/RUNNING keywords.&lt;/li>
+&lt;li>Support more quantifiers.&lt;/li>
+&lt;li>Add optimization to the NFA.&lt;/li>
+&lt;li>A better way to realize MATCH_RECOGNIZE might be having a Complex Event Processing library at BEAM core (rather than using BEAM transforms).&lt;/li>
+&lt;/ul>
+&lt;!-- Related Documents:
+- proposal
+- design doc
+- SQL 2016 standard
+- UMASS NFA^b paper
+-->
+&lt;h2 id="references">References&lt;/h2>
+&lt;ul>
+&lt;li>&lt;a href="https://drive.google.com/file/d/1ZuFZV4dCFVPZW_-RiqbU0w-vShaZh_jX/view?usp=sharing">Project Proposal&lt;/a>&lt;/li>
+&lt;li>&lt;a href="https://s.apache.org/beam-sql-pattern-recognization">Design Documentation&lt;/a>&lt;/li>
+&lt;li>&lt;a href="https://www.iso.org/standard/65143.html">SQL 2016 documentation Part 5&lt;/a>&lt;/li>
+&lt;li>&lt;a href="https://dl.acm.org/doi/10.1145/1376616.1376634">UMASS paper on NFA with shared buffer&lt;/a>&lt;/li>
+&lt;/ul></description></item><item><title>Blog: Improved Annotation Support for the Python SDK</title><link>/blog/python-improved-annotations/</link><pubDate>Fri, 21 Aug 2020 00:00:01 -0800</pubDate><guid>/blog/python-improved-annotations/</guid><description>
 &lt;!--
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@@ -5157,27 +5345,4 @@ anticipated, perhaps one every 1-2 months.&lt;/p>
 the developer experience will be our focus for the next several months. If you
 have any comments or discover any issues, I’d like to invite you to reach out
 to us via &lt;a href="/get-started/support/">user’s mailing list&lt;/a> or the
-&lt;a href="https://issues.apache.org/jira/browse/BEAM/">Apache JIRA issue tracker&lt;/a>.&lt;/p></description></item><item><title>Blog: How We Added Windowing to the Apache Flink Batch Runner</title><link>/blog/flink-batch-runner-milestone/</link><pubDate>Mon, 13 Jun 2016 09:00:00 -0700</pubDate><guid>/blog/flink-batch-runner-milestone/</guid><description>
-&lt;!--
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-http://www.apache.org/licenses/LICENSE-2.0
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
--->
-&lt;p>We recently achieved a major milestone by adding support for windowing to the &lt;a href="https://flink.apache.org">Apache Flink&lt;/a> Batch runner. In this post we would like to explain what this means for users of Apache Beam and highlight some of the implementation details.&lt;/p>
-&lt;p>Before we start, though, let’s quickly talk about the execution of Beam programs and how this is relevant to today’s post. A Beam pipeline can contain bounded and unbounded sources. If the pipeline only contains bounded sources it can be executed in a batch fashion, if it contains some unbounded sources it must be executed in a streaming fashion. When executing a Beam pipeline on Flink, you don’t have to choose the execution mode. Internally, the Flink runner either translates the  [...]
-&lt;h2 id="what-does-this-mean-for-users">What does this mean for users?&lt;/h2>
-&lt;p>Support for windowing was the last missing puzzle piece for making the Flink Batch runner compatible with the Beam model. With the latest change to the Batch runner users can now run any pipeline that only contains bounded sources and be certain that the results match those of the original reference-implementation runners that were provided by Google as part of the initial code drop coming from the Google Dataflow SDK.&lt;/p>
-&lt;p>The most obvious part of the change is that windows can now be assigned to elements and that the runner respects these windows for the &lt;code>GroupByKey&lt;/code> and &lt;code>Combine&lt;/code> operations. A not-so-obvious change concerns side-inputs. In the Beam model, side inputs respect windows; when a value of the main input is being processed only the side input that corresponds to the correct window is available to the processing function, the &lt;code>DoFn&lt;/code>.&lt;/p>
-&lt;p>Getting side-input semantics right is an important milestone in it’s own because it allows to use a big suite of unit tests for verifying the correctness of a runner implementation. These tests exercise every obscure detail of the Beam programming model and verify that the results produced by a runner match what you would expect from a correct implementation. In the suite, side inputs are used to compare the expected result to the actual result. With these tests being executed regu [...]
-&lt;h2 id="under-the-hood">Under the Hood&lt;/h2>
-&lt;p>The basis for the changes is the introduction of &lt;code>WindowedValue&lt;/code> in the generated Flink transformations. Before, a Beam &lt;code>PCollection&amp;lt;T&amp;gt;&lt;/code> would be transformed to a &lt;code>DataSet&amp;lt;T&amp;gt;&lt;/code>. Now, we instead create a &lt;code>DataSet&amp;lt;WindowedValue&amp;lt;T&amp;gt;&amp;gt;&lt;/code>. The &lt;code>WindowedValue&amp;lt;T&amp;gt;&lt;/code> stores meta data about the value, such as the timestamp and the windows to wh [...]
-&lt;p>With this basic change out of the way we just had to make sure that windows were respected for side inputs and that &lt;code>Combine&lt;/code> and &lt;code>GroupByKey&lt;/code> correctly handled windows. The tricky part there is the handling of merging windows such as session windows. For these we essentially emulate the behavior of a merging &lt;code>WindowFn&lt;/code> in our own code.&lt;/p>
-&lt;p>After we got side inputs working we could enable the aforementioned suite of tests to check how well the runner behaves with respect to the Beam model. As can be expected there were quite some discrepancies but we managed to resolve them all. In the process, we also slimmed down the runner implementation. For example, we removed all custom translations for sources and sinks and are now relying only on Beam code for these, thereby greatly reducing the maintenance overhead.&lt;/p>
-&lt;h2 id="summary">Summary&lt;/h2>
-&lt;p>We reached a major milestone in adding windowing support to the Flink Batch runner, thereby making it compatible with the Beam model. Because of the large suite of tests that can now be executed on the runner we are also confident about the correctness of the implementation and about it staying that way in the future.&lt;/p></description></item></channel></rss>
\ No newline at end of file
+&lt;a href="https://issues.apache.org/jira/browse/BEAM/">Apache JIRA issue tracker&lt;/a>.&lt;/p></description></item></channel></rss>
\ No newline at end of file
diff --git a/website/generated-content/blog/pattern-match-beam-sql/index.html b/website/generated-content/blog/pattern-match-beam-sql/index.html
new file mode 100644
index 0000000..6e12b2f
--- /dev/null
+++ b/website/generated-content/blog/pattern-match-beam-sql/index.html
@@ -0,0 +1,56 @@
+<!doctype html><html lang=en class=no-js><head><meta charset=utf-8><meta http-equiv=x-ua-compatible content="IE=edge"><meta name=viewport content="width=device-width,initial-scale=1"><title>Pattern Matching with Beam SQL</title><meta name=description content="Apache Beam is an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterprise Integration Patterns (EIPs) an [...]
+<span class=sr-only>Toggle navigation</span>
+<span class=icon-bar></span><span class=icon-bar></span><span class=icon-bar></span></button>
+<a href=/ class=navbar-brand><img alt=Brand style=height:25px src=/images/beam_logo_navbar.png></a></div><div class="navbar-mask closed"></div><div id=navbar class="navbar-container closed"><ul class="nav navbar-nav"><li><a href=/get-started/beam-overview/>Get Started</a></li><li><a href=/documentation/>Documentation</a></li><li><a href=/documentation/sdks/java/>Languages</a></li><li><a href=/documentation/runners/capability-matrix/>RUNNERS</a></li><li><a href=/roadmap/>Roadmap</a></li>< [...]
+•</p></header><div class=post-content itemprop=articleBody><h2 id=introduction>Introduction</h2><p>SQL is becoming increasingly powerful and useful in the field of data analysis. MATCH_RECOGNIZE,
+a new SQL component introduced in 2016, brings extra analytical functionality. This project,
+as part of Google Summer of Code, aims to support basic MATCH_RECOGNIZE functionality. A basic MATCH_RECOGNIZE
+query would be something like this:<div class=language-sql><div class=highlight><pre class=chroma><code class=language-sql data-lang=sql><span class=k>SELECT</span> <span class=n>T</span><span class=p>.</span><span class=n>aid</span><span class=p>,</span> <span class=n>T</span><span class=p>.</span><span class=n>bid</span><span class=p>,</span> <span class=n>T</span><span class=p>.</span><span class=n>cid</span>
+<span class=k>FROM</span> <span class=n>MyTable</span>
+    <span class=n>MATCH_RECOGNIZE</span> <span class=p>(</span>
+      <span class=n>PARTITION</span> <span class=k>BY</span> <span class=n>userid</span>
+      <span class=k>ORDER</span> <span class=k>BY</span> <span class=n>proctime</span>
+      <span class=n>MEASURES</span>
+        <span class=n>A</span><span class=p>.</span><span class=n>id</span> <span class=k>AS</span> <span class=n>aid</span><span class=p>,</span>
+        <span class=n>B</span><span class=p>.</span><span class=n>id</span> <span class=k>AS</span> <span class=n>bid</span><span class=p>,</span>
+        <span class=k>C</span><span class=p>.</span><span class=n>id</span> <span class=k>AS</span> <span class=n>cid</span>
+      <span class=n>PATTERN</span> <span class=p>(</span><span class=n>A</span> <span class=n>B</span> <span class=k>C</span><span class=p>)</span>
+      <span class=n>DEFINE</span>
+        <span class=n>A</span> <span class=k>AS</span> <span class=n>name</span> <span class=o>=</span> <span class=s1>&#39;a&#39;</span><span class=p>,</span>
+        <span class=n>B</span> <span class=k>AS</span> <span class=n>name</span> <span class=o>=</span> <span class=s1>&#39;b&#39;</span><span class=p>,</span>
+        <span class=k>C</span> <span class=k>AS</span> <span class=n>name</span> <span class=o>=</span> <span class=s1>&#39;c&#39;</span>
+    <span class=p>)</span> <span class=k>AS</span> <span class=n>T</span></code></pre></div></div></p><p>The above query finds out ordered sets of events that have names &lsquo;a&rsquo;, &lsquo;b&rsquo; and &lsquo;c&rsquo;. Apart from this basic usage of
+MATCH_RECOGNIZE, I also supported a few other crucial features, such as quantifiers and row pattern navigation. I will spell out
+the details in later sections.</p><h2 id=approach--discussion>Approach & Discussion</h2><p>The implementation is built on Beam core transforms. Specifically, one MATCH_RECOGNIZE execution is composed of the
+following series of transforms:</p><ol><li>A <code>ParDo</code> transform and then a <code>GroupByKey</code> transform that build up the partitions (PARTITION BY).</li><li>A <code>ParDo</code> transform that sorts within each partition (ORDER BY).</li><li>A <code>ParDo</code> transform that applies pattern matching in each sorted partition.</li></ol><p>Pattern matching was first implemented with the Java regex library. That is, I first transform the rows within a partition into
+a string and then apply regex pattern-matching routines. If a row satisfies a condition, I output the corresponding pattern variable.
+This works under the assumption that the pattern definitions are mutually exclusive. That is, a pattern definition like <code>A AS A.price > 0, B AS B.price &lt; 0</code> is allowed, while
+a pattern definition like <code>A AS A.price > 0, B AS B.proctime > 0</code> might result in an incomplete match. In the latter case,
+an event can satisfy the conditions A and B at the same time. Mutually exclusive conditions give a deterministic pattern match:
+each event can belong to at most one pattern class.</p><p>As specified in the SQL:2016 standard, MATCH_RECOGNIZE defines a richer set of expressions than regular expressions. Specifically,
+it introduces <em>Row Pattern Navigation Operations</em> such as <code>PREV</code> and <code>NEXT</code>. This is perhaps one of the most intriguing features of
+MATCH_RECOGNIZE. A regex library no longer suffices, since a pattern definition can be back-referencing (<code>PREV</code>) or
+forward-referencing (<code>NEXT</code>). So for the second version of the implementation, we chose to use an NFA-based regex engine. An NFA brings more flexibility
+in terms of non-determinism (see Chapter 6 of SQL:2016 Part 5 for a more thorough discussion). My proposed NFA is based on a paper from UMass.</p><p>This is a work in progress; many components are not yet supported. I will list some unimplemented work in the section
+of future work.</p><h2 id=usages>Usages</h2><p>For now, the components I supported are:</p><ul><li>PARTITION BY</li><li>ORDER BY</li><li>MEASURES<ol><li>LAST</li><li>FIRST</li></ol></li><li>ONE ROW PER MATCH/ALL ROWS PER MATCH</li><li>DEFINE<ol><li>Left side of the condition<ol><li>LAST</li></ol></li><li>Right side of the condition<ol><li>PREV</li></ol></li></ol></li><li>Quantifier<ol><li>Kleene plus</li></ol></li></ul><p>The pattern definition evaluation is hard coded. To be more specif [...]
+to be on the left side of a comparator. Additionally, the PREV function can appear only on the right side of the comparator.</p><p>With these limited tools, we can already write some slightly more complicated queries. Imagine we have the following
+table:</p><table><thead><tr><th align=center>transTime</th><th align=center>price</th></tr></thead><tbody><tr><td align=center>1</td><td align=center>3</td></tr><tr><td align=center>2</td><td align=center>2</td></tr><tr><td align=center>3</td><td align=center>1</td></tr><tr><td align=center>4</td><td align=center>5</td></tr><tr><td align=center>5</td><td align=center>6</td></tr></tbody></table><p>This table reflects the price changes of a product with respect to the transaction time. We  [...]
+query:</p><div class=language-sql><div class=highlight><pre class=chroma><code class=language-sql data-lang=sql><span class=k>SELECT</span> <span class=o>*</span>
+<span class=k>FROM</span> <span class=n>MyTable</span>
+    <span class=n>MATCH_RECOGNIZE</span> <span class=p>(</span>
+      <span class=k>ORDER</span> <span class=k>BY</span> <span class=n>transTime</span>
+      <span class=n>MEASURES</span>
+        <span class=k>LAST</span><span class=p>(</span><span class=n>A</span><span class=p>.</span><span class=n>price</span><span class=p>)</span> <span class=k>AS</span> <span class=n>beforePrice</span><span class=p>,</span>
+        <span class=k>FIRST</span><span class=p>(</span><span class=n>B</span><span class=p>.</span><span class=n>price</span><span class=p>)</span> <span class=k>AS</span> <span class=n>afterPrice</span>
+      <span class=n>PATTERN</span> <span class=p>(</span><span class=n>A</span><span class=o>+</span> <span class=n>B</span><span class=o>+</span><span class=p>)</span>
+      <span class=n>DEFINE</span>
+        <span class=n>A</span> <span class=k>AS</span> <span class=n>price</span> <span class=o>&lt;</span> <span class=n>PREV</span><span class=p>(</span><span class=n>A</span><span class=p>.</span><span class=n>price</span><span class=p>),</span>
+        <span class=n>B</span> <span class=k>AS</span> <span class=n>price</span> <span class=o>&gt;</span> <span class=n>PREV</span><span class=p>(</span><span class=n>B</span><span class=p>.</span><span class=n>price</span><span class=p>)</span>
+    <span class=p>)</span> <span class=k>AS</span> <span class=n>T</span></code></pre></div></div><p>This will find the local minimum price and the price after it. For the example dataset, the first 3 rows will be
+mapped to A and the rest of the rows will be mapped to B. Thus, we will have (1, 5) as the result.</p><blockquote><p>Very important: my NFA implementation slightly deviates from the SQL standard. Since the buffered NFA
+only stores an event in the buffer if the event matches some pattern class, there is no way to get the
+previous event back if the previous row is discarded. So the first row is always treated as a match (unlike the standard)
+if PREV is used.</p></blockquote><h2 id=progress>Progress</h2><ol><li>PRs<ol><li><a href=https://github.com/apache/beam/pull/12232>Support MATCH_RECOGNIZE using regex library</a> (merged)</li><li><a href=https://github.com/apache/beam/pull/12532>Support MATCH_RECOGNIZE using NFA</a> (pending)</li></ol></li><li>Commits<ol><li>partition by: <a href=https://github.com/apache/beam/pull/12232/commits/064ada7257970bcb1d35530be1b88cb3830f242b>commit 064ada7</a></li><li>order by: <a href=https:/ [...]
+<a href=http://www.apache.org>The Apache Software Foundation</a>
+| <a href=/privacy_policy>Privacy Policy</a>
+| <a href=/feed.xml>RSS Feed</a><br><br>Apache Beam, Apache, Beam, the Beam logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation. All other products or name brands are trademarks of their respective holders, including The Apache Software Foundation.</div></footer></body></html>
\ No newline at end of file
diff --git a/website/generated-content/categories/blog/index.xml b/website/generated-content/categories/blog/index.xml
index b424dd2..f615297 100644
--- a/website/generated-content/categories/blog/index.xml
+++ b/website/generated-content/categories/blog/index.xml
@@ -1,4 +1,192 @@
-<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Apache Beam – blog</title><link>/categories/blog/</link><description>Recent content in blog on Apache Beam</description><generator>Hugo -- gohugo.io</generator><lastBuildDate>Fri, 21 Aug 2020 00:00:01 -0800</lastBuildDate><atom:link href="/categories/blog/index.xml" rel="self" type="application/rss+xml"/><item><title>Blog: Improved Annotation Support for the Python SDK</title><link>/blog/python-improved-annotatio [...]
+<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Apache Beam – blog</title><link>/categories/blog/</link><description>Recent content in blog on Apache Beam</description><generator>Hugo -- gohugo.io</generator><lastBuildDate>Thu, 27 Aug 2020 00:00:01 +0800</lastBuildDate><atom:link href="/categories/blog/index.xml" rel="self" type="application/rss+xml"/><item><title>Blog: Pattern Matching with Beam SQL</title><link>/blog/pattern-match-beam-sql/</link><pubDate>Th [...]
+&lt;!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+&lt;h2 id="introduction">Introduction&lt;/h2>
+&lt;p>SQL is becoming increasingly powerful and useful in the field of data analysis. MATCH_RECOGNIZE,
+a new SQL component introduced in 2016, brings extra analytical functionality. This project,
+as part of Google Summer of Code, aims to support basic MATCH_RECOGNIZE functionality. A basic MATCH_RECOGNIZE
+query would be something like this:
+&lt;div class=language-sql>
+&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="k">SELECT&lt;/span> &lt;span class="n">T&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">aid&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">T&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">bid&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">T&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">cid&lt;/span>
+&lt;span class="k">FROM&lt;/span> &lt;span class="n">MyTable&lt;/span>
+&lt;span class="n">MATCH_RECOGNIZE&lt;/span> &lt;span class="p">(&lt;/span>
+&lt;span class="n">PARTITION&lt;/span> &lt;span class="k">BY&lt;/span> &lt;span class="n">userid&lt;/span>
+&lt;span class="k">ORDER&lt;/span> &lt;span class="k">BY&lt;/span> &lt;span class="n">proctime&lt;/span>
+&lt;span class="n">MEASURES&lt;/span>
+&lt;span class="n">A&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">id&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">aid&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="n">B&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">id&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">bid&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="k">C&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">id&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">cid&lt;/span>
+&lt;span class="n">PATTERN&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">A&lt;/span> &lt;span class="n">B&lt;/span> &lt;span class="k">C&lt;/span>&lt;span class="p">)&lt;/span>
+&lt;span class="n">DEFINE&lt;/span>
+&lt;span class="n">A&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s1">&amp;#39;a&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="n">B&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s1">&amp;#39;b&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="k">C&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s1">&amp;#39;c&amp;#39;&lt;/span>
+&lt;span class="p">)&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">T&lt;/span>&lt;/code>&lt;/pre>&lt;/div>
+&lt;/div>
+&lt;/p>
+&lt;p>The above query finds ordered sequences of events that have names &amp;lsquo;a&amp;rsquo;, &amp;lsquo;b&amp;rsquo; and &amp;lsquo;c&amp;rsquo;. Apart from this basic usage of
+MATCH_RECOGNIZE, I supported a few other crucial features such as quantifiers and row pattern navigation. I will spell out
+the details in later sections.&lt;/p>
+&lt;h2 id="approach--discussion">Approach &amp;amp; Discussion&lt;/h2>
+&lt;p>The implementation is built on Beam core transforms. Specifically, one MATCH_RECOGNIZE execution is composed of the
+following series of transforms:&lt;/p>
+&lt;ol>
+&lt;li>A &lt;code>ParDo&lt;/code> transform and then a &lt;code>GroupByKey&lt;/code> transform that build up the partitions (PARTITION BY).&lt;/li>
+&lt;li>A &lt;code>ParDo&lt;/code> transform that sorts within each partition (ORDER BY).&lt;/li>
+&lt;li>A &lt;code>ParDo&lt;/code> transform that applies pattern-match in each sorted partition.&lt;/li>
+&lt;/ol>
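The three-step shape above can be sketched in plain Python. This is a toy, single-machine illustration only, not actual Apache Beam code: in Beam, step 1 is a ParDo plus GroupByKey, and the partitioning, sorting, and matching run in a distributed, streaming-aware fashion. The row schema and the A/B conditions here are made up for the example, and operator.lt/operator.gt stand in for the comparison symbols.

```python
import operator
from collections import defaultdict

# Hypothetical example rows: one user, out-of-order transaction times.
rows = [
    {"user": "u1", "time": 2, "price": 2},
    {"user": "u1", "time": 1, "price": 3},
    {"user": "u1", "time": 3, "price": 1},
    {"user": "u1", "time": 4, "price": 5},
    {"user": "u1", "time": 5, "price": 6},
]

def partition_by(rows, key):
    # Step 1: build partitions keyed by a column (PARTITION BY).
    parts = defaultdict(list)
    for row in rows:
        parts[row[key]].append(row)
    return parts

def label_rows(part):
    # Step 3: label each row of a sorted partition with a pattern variable.
    # Condition A: price strictly below the previous price;
    # condition B: price strictly above the previous price.
    labels = []
    prev = None
    for cur in part:
        if prev is None or operator.lt(cur["price"], prev["price"]):
            labels.append("A")
        elif operator.gt(cur["price"], prev["price"]):
            labels.append("B")
        else:
            labels.append("skip")
        prev = cur
    return labels

for key, part in partition_by(rows, "user").items():
    part.sort(key=lambda r: r["time"])  # Step 2: sort within a partition (ORDER BY)
    print(key, label_rows(part))        # prints: u1 ['A', 'A', 'A', 'B', 'B']
```

The point of the sketch is only the composition: partitioning, per-partition sorting, and per-partition matching are independent stages, which is what lets each map onto its own Beam transform.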
+&lt;p>Pattern matching was first implemented with the Java regex library. That is, I first transform the rows within a partition into
+a string and then apply regex pattern-match routines. If a row satisfies a condition, I output the corresponding pattern variable.
+This is acceptable under the assumption that the pattern definitions are mutually exclusive. That is, a pattern definition like &lt;code>A AS A.price &amp;gt; 0, B AS b.price &amp;lt; 0&lt;/code> is allowed, while
+a pattern definition like &lt;code>A AS A.price &amp;gt; 0, B AS B.proctime &amp;gt; 0&lt;/code> might result in an incomplete match. In the latter case,
+an event can satisfy the conditions A and B at the same time. Mutually exclusive conditions give a deterministic pattern match:
+each event can only belong to at most one pattern class.&lt;/p>
+&lt;p>As specified in the SQL 2016 document, MATCH_RECOGNIZE defines a richer set of expressions than regular expressions. Specifically,
+it introduces &lt;em>Row Pattern Navigation Operations&lt;/em> such as &lt;code>PREV&lt;/code> and &lt;code>NEXT&lt;/code>. This is perhaps one of the most intriguing features of
+MATCH_RECOGNIZE. A regex library would no longer suffice, since the pattern definition could be back-referencing (&lt;code>PREV&lt;/code>) or
+forward-referencing (&lt;code>NEXT&lt;/code>). So for the second version of the implementation, we chose to use an NFA regex engine. An NFA brings more flexibility
+in terms of non-determinism (see Chapter 6 of SQL 2016 Part 5 for a more thorough discussion). My proposed NFA is based on a paper from UMass.&lt;/p>
+&lt;p>This is a work in progress. Many components are still unsupported. I will list some unimplemented work in the section
+of future work.&lt;/p>
+&lt;h2 id="usages">Usages&lt;/h2>
+&lt;p>For now, the components I supported are:&lt;/p>
+&lt;ul>
+&lt;li>PARTITION BY&lt;/li>
+&lt;li>ORDER BY&lt;/li>
+&lt;li>MEASURES
+&lt;ol>
+&lt;li>LAST&lt;/li>
+&lt;li>FIRST&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;li>ONE ROW PER MATCH/ALL ROWS PER MATCH&lt;/li>
+&lt;li>DEFINE
+&lt;ol>
+&lt;li>Left side of the condition
+&lt;ol>
+&lt;li>LAST&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;li>Right side of the condition
+&lt;ol>
+&lt;li>PREV&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;li>Quantifier
+&lt;ol>
+&lt;li>Kleene plus&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;/ul>
+&lt;p>The pattern definition evaluation is hard-coded. To be more specific, it expects the column reference of the incoming row
+to be on the left side of a comparator. Additionally, the PREV function can only appear on the right side of the comparator.&lt;/p>
+&lt;p>Even with these limited tools, we can already write slightly more complicated queries. Imagine we have the following
+table:&lt;/p>
+&lt;table>
+&lt;thead>
+&lt;tr>
+&lt;th align="center">transTime&lt;/th>
+&lt;th align="center">price&lt;/th>
+&lt;/tr>
+&lt;/thead>
+&lt;tbody>
+&lt;tr>
+&lt;td align="center">1&lt;/td>
+&lt;td align="center">3&lt;/td>
+&lt;/tr>
+&lt;tr>
+&lt;td align="center">2&lt;/td>
+&lt;td align="center">2&lt;/td>
+&lt;/tr>
+&lt;tr>
+&lt;td align="center">3&lt;/td>
+&lt;td align="center">1&lt;/td>
+&lt;/tr>
+&lt;tr>
+&lt;td align="center">4&lt;/td>
+&lt;td align="center">5&lt;/td>
+&lt;/tr>
+&lt;tr>
+&lt;td align="center">5&lt;/td>
+&lt;td align="center">6&lt;/td>
+&lt;/tr>
+&lt;/tbody>
+&lt;/table>
+&lt;p>This table reflects the price changes of a product with respect to the transaction time. We could write the following
+query:&lt;/p>
+&lt;div class=language-sql>
+&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="k">SELECT&lt;/span> &lt;span class="o">*&lt;/span>
+&lt;span class="k">FROM&lt;/span> &lt;span class="n">MyTable&lt;/span>
+&lt;span class="n">MATCH_RECOGNIZE&lt;/span> &lt;span class="p">(&lt;/span>
+&lt;span class="k">ORDER&lt;/span> &lt;span class="k">BY&lt;/span> &lt;span class="n">transTime&lt;/span>
+&lt;span class="n">MEASURES&lt;/span>
+&lt;span class="k">LAST&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">A&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">price&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">beforePrice&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="k">FIRST&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">B&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">price&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">afterPrice&lt;/span>
+&lt;span class="n">PATTERN&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">A&lt;/span>&lt;span class="o">+&lt;/span> &lt;span class="n">B&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="p">)&lt;/span>
+&lt;span class="n">DEFINE&lt;/span>
+&lt;span class="n">A&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">price&lt;/span> &lt;span class="o">&amp;lt;&lt;/span> &lt;span class="n">PREV&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">A&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">price&lt;/span>&lt;span class="p">),&lt;/span>
+&lt;span class="n">B&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">price&lt;/span> &lt;span class="o">&amp;gt;&lt;/span> &lt;span class="n">PREV&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">B&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">price&lt;/span>&lt;span class="p">)&lt;/span>
+&lt;span class="p">)&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">T&lt;/span>&lt;/code>&lt;/pre>&lt;/div>
+&lt;/div>
+&lt;p>This will find the local minimum price and the price after it. For the example dataset, the first 3 rows will be
+mapped to A and the rest of the rows will be mapped to B. Thus, we will have (1, 5) as the result.&lt;/p>
+&lt;blockquote>
+&lt;p>Very important: my NFA implementation slightly deviates from the SQL standard. Since the buffered NFA
+only stores an event in the buffer if the event matches some pattern class, there is no way to get the
+previous event back if the previous row is discarded. So the first row is always treated as a match (unlike the standard)
+if PREV is used.&lt;/p>
+&lt;/blockquote>
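The deviation described in the note above can be modeled in a few lines. This is an assumption about the behavior, not the actual Beam NFA code: a PREV-based condition such as "price strictly below PREV(price)" has no predecessor to look at on the first row, so that row is accepted unconditionally.

```python
import operator

# Hypothetical model of a PREV-based DEFINE condition under the deviation
# noted above. operator.lt stands in for the "strictly below" comparison.
def satisfies_prev_condition(prev_price, price):
    if prev_price is None:
        # First row: PREV(price) is undefined, so the row always matches.
        return True
    return operator.lt(price, prev_price)
```

For example, the first row of the price table matches even though its price (3) is not below anything, while a later row matches only if its price actually drops.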
+&lt;h2 id="progress">Progress&lt;/h2>
+&lt;ol>
+&lt;li>PRs
+&lt;ol>
+&lt;li>&lt;a href="https://github.com/apache/beam/pull/12232">Support MATCH_RECOGNIZE using regex library&lt;/a> (merged)&lt;/li>
+&lt;li>&lt;a href="https://github.com/apache/beam/pull/12532">Support MATCH_RECOGNIZE using NFA&lt;/a> (pending)&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;li>Commits
+&lt;ol>
+&lt;li>partition by: &lt;a href="https://github.com/apache/beam/pull/12232/commits/064ada7257970bcb1d35530be1b88cb3830f242b">commit 064ada7&lt;/a>&lt;/li>
+&lt;li>order by: &lt;a href="https://github.com/apache/beam/pull/12232/commits/9cd1a82bec7b2f7c44aacfbd72f5f775bb58b650">commit 9cd1a82&lt;/a>&lt;/li>
+&lt;li>regex pattern match: &lt;a href="https://github.com/apache/beam/pull/12232/commits/8d6ffcc213e30999fc495c119b68da4f62fad258">commit 8d6ffcc&lt;/a>&lt;/li>
+&lt;li>support quantifiers: &lt;a href="https://github.com/apache/beam/pull/12232/commits/f529b876a2c2e43d012c71b3a83ebd55eb16f4ff">commit f529b87&lt;/a>&lt;/li>
+&lt;li>measures: &lt;a href="https://github.com/apache/beam/pull/12232/commits/87935746647611aa139d664ebed10c8e638bb024">commit 8793574&lt;/a>&lt;/li>
+&lt;li>added NFA implementation: &lt;a href="https://github.com/apache/beam/pull/12532/commits/fc731f2b0699d11853e7b76da86456427d434a2a">commit fc731f2&lt;/a>&lt;/li>
+&lt;li>implemented functions PREV and LAST: &lt;a href="https://github.com/apache/beam/pull/12532/commits/fc731f2b0699d11853e7b76da86456427d434a2a">commit 35323da&lt;/a>&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;/ol>
+&lt;h2 id="future-work">Future Work&lt;/h2>
+&lt;ul>
+&lt;li>Support FINAL/RUNNING keywords.&lt;/li>
+&lt;li>Support more quantifiers.&lt;/li>
+&lt;li>Add optimization to the NFA.&lt;/li>
+&lt;li>A better way to realize MATCH_RECOGNIZE might be having a Complex Event Processing library in Beam core (rather than using Beam transforms).&lt;/li>
+&lt;/ul>
+&lt;!-- Related Documents:
+- proposal
+- design doc
+- SQL 2016 standard
+- UMASS NFA^b paper
+-->
+&lt;h2 id="references">References&lt;/h2>
+&lt;ul>
+&lt;li>&lt;a href="https://drive.google.com/file/d/1ZuFZV4dCFVPZW_-RiqbU0w-vShaZh_jX/view?usp=sharing">Project Proposal&lt;/a>&lt;/li>
+&lt;li>&lt;a href="https://s.apache.org/beam-sql-pattern-recognization">Design Documentation&lt;/a>&lt;/li>
+&lt;li>&lt;a href="https://www.iso.org/standard/65143.html">SQL 2016 documentation Part 5&lt;/a>&lt;/li>
+&lt;li>&lt;a href="https://dl.acm.org/doi/10.1145/1376616.1376634">UMASS paper on NFA with shared buffer&lt;/a>&lt;/li>
+&lt;/ul></description></item><item><title>Blog: Improved Annotation Support for the Python SDK</title><link>/blog/python-improved-annotations/</link><pubDate>Fri, 21 Aug 2020 00:00:01 -0800</pubDate><guid>/blog/python-improved-annotations/</guid><description>
 &lt;!--
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@@ -5200,18 +5388,4 @@ PCollection&amp;lt;O&amp;gt; output = input
 &lt;i>Three different visualizations of a simple WordCount pipeline which computes the number of occurrences of every word in a set of text files. The flat view gives the full DAG of all operations performed. The execution view groups operations according to how they're executed, e.g. after performing runner-specific optimizations like function composition. The structured view nests operations according to their grouping in PTransforms.&lt;/i>
 &lt;/div>
 &lt;h2 id="summary">Summary&lt;/h2>
-&lt;p>Although it&amp;rsquo;s tempting to add methods to PCollections, such an approach is not scalable, extensible, or sufficiently expressive. Putting a single apply method on PCollection and all the logic into the operation itself lets us have the best of both worlds, and avoids hard cliffs of complexity by having a single consistent style across simple and complex pipelines, and between predefined and user-defined operations.&lt;/p></description></item><item><title>Blog: Dynamic work [...]
-&lt;!--
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-http://www.apache.org/licenses/LICENSE-2.0
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
--->
-&lt;p>This morning, Eugene and Malo from the Google Cloud Dataflow team posted &lt;a href="https://cloud.google.com/blog/big-data/2016/05/no-shard-left-behind-dynamic-work-rebalancing-in-google-cloud-dataflow">&lt;em>No shard left behind: dynamic work rebalancing in Google Cloud Dataflow&lt;/em>&lt;/a>. This article discusses Cloud Dataflow’s solution to the well-known straggler problem.&lt;/p>
-&lt;p>In a large batch processing job with many tasks executing in parallel, some of the tasks &amp;ndash; the stragglers &amp;ndash; can take a much longer time to complete than others, perhaps due to imperfect splitting of the work into parallel chunks when issuing the job. Typically, waiting for stragglers means that the overall job completes later than it should, and may also reserve too many machines that may be underutilized at the end. Cloud Dataflow’s dynamic work rebalancing can [...]
-&lt;p>What I’d like to highlight for the Apache Beam (incubating) community is that Cloud Dataflow’s dynamic work rebalancing is implemented using &lt;em>runner-specific&lt;/em> control logic on top of Beam’s &lt;em>runner-independent&lt;/em> &lt;a href="https://github.com/apache/beam/blob/9fa97fb2491bc784df53fb0f044409dbbc2af3d7/sdks/java/core/src/main/java/org/apache/beam/sdk/io/BoundedSource.java">&lt;code>BoundedSource API&lt;/code>&lt;/a>. Specifically, to steal work from a straggle [...]
\ No newline at end of file
+&lt;p>Although it&amp;rsquo;s tempting to add methods to PCollections, such an approach is not scalable, extensible, or sufficiently expressive. Putting a single apply method on PCollection and all the logic into the operation itself lets us have the best of both worlds, and avoids hard cliffs of complexity by having a single consistent style across simple and complex pipelines, and between predefined and user-defined operations.&lt;/p></description></item></channel></rss>
\ No newline at end of file
diff --git a/website/generated-content/categories/index.xml b/website/generated-content/categories/index.xml
index b9555ba..5ac594b 100644
--- a/website/generated-content/categories/index.xml
+++ b/website/generated-content/categories/index.xml
@@ -1 +1 @@
-<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Apache Beam – Categories</title><link>/categories/</link><description>Recent content in Categories on Apache Beam</description><generator>Hugo -- gohugo.io</generator><lastBuildDate>Fri, 21 Aug 2020 00:00:01 -0800</lastBuildDate><atom:link href="/categories/index.xml" rel="self" type="application/rss+xml"/></channel></rss>
\ No newline at end of file
+<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Apache Beam – Categories</title><link>/categories/</link><description>Recent content in Categories on Apache Beam</description><generator>Hugo -- gohugo.io</generator><lastBuildDate>Thu, 27 Aug 2020 00:00:01 +0800</lastBuildDate><atom:link href="/categories/index.xml" rel="self" type="application/rss+xml"/></channel></rss>
\ No newline at end of file
diff --git a/website/generated-content/feed.xml b/website/generated-content/feed.xml
index 18053c6..6125c7b 100644
--- a/website/generated-content/feed.xml
+++ b/website/generated-content/feed.xml
@@ -1,4 +1,192 @@
-<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title>Apache Beam</title><description>Apache Beam is an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterprise Integration Patterns (EIPs) and Domain Specific Languages (DSLs). Dataflow pipelines simplify the mechanics of large-scale batch and streaming data processing and can run on a number [...]
+<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title>Apache Beam</title><description>Apache Beam is an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterprise Integration Patterns (EIPs) and Domain Specific Languages (DSLs). Dataflow pipelines simplify the mechanics of large-scale batch and streaming data processing and can run on a number [...]
+&lt;!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+&lt;h2 id="introduction">Introduction&lt;/h2>
+&lt;p>SQL is becoming increasingly powerful and useful in the field of data analysis. MATCH_RECOGNIZE,
+a new SQL component introduced in 2016, brings extra analytical functionality. This project,
+as part of Google Summer of Code, aims to support basic MATCH_RECOGNIZE functionality. A basic MATCH_RECOGNIZE
+query would be something like this:
+&lt;div class=language-sql>
+&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="k">SELECT&lt;/span> &lt;span class="n">T&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">aid&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">T&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">bid&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">T&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">cid&lt;/span>
+&lt;span class="k">FROM&lt;/span> &lt;span class="n">MyTable&lt;/span>
+&lt;span class="n">MATCH_RECOGNIZE&lt;/span> &lt;span class="p">(&lt;/span>
+&lt;span class="n">PARTITION&lt;/span> &lt;span class="k">BY&lt;/span> &lt;span class="n">userid&lt;/span>
+&lt;span class="k">ORDER&lt;/span> &lt;span class="k">BY&lt;/span> &lt;span class="n">proctime&lt;/span>
+&lt;span class="n">MEASURES&lt;/span>
+&lt;span class="n">A&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">id&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">aid&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="n">B&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">id&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">bid&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="k">C&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">id&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">cid&lt;/span>
+&lt;span class="n">PATTERN&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">A&lt;/span> &lt;span class="n">B&lt;/span> &lt;span class="k">C&lt;/span>&lt;span class="p">)&lt;/span>
+&lt;span class="n">DEFINE&lt;/span>
+&lt;span class="n">A&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s1">&amp;#39;a&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="n">B&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s1">&amp;#39;b&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="k">C&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s1">&amp;#39;c&amp;#39;&lt;/span>
+&lt;span class="p">)&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">T&lt;/span>&lt;/code>&lt;/pre>&lt;/div>
+&lt;/div>
+&lt;/p>
+&lt;p>The above query finds ordered sequences of events that have names &amp;lsquo;a&amp;rsquo;, &amp;lsquo;b&amp;rsquo; and &amp;lsquo;c&amp;rsquo;. Apart from this basic usage of
+MATCH_RECOGNIZE, I supported a few other crucial features such as quantifiers and row pattern navigation. I will spell out
+the details in later sections.&lt;/p>
+&lt;h2 id="approach--discussion">Approach &amp;amp; Discussion&lt;/h2>
+&lt;p>The implementation is built on Beam core transforms. Specifically, one MATCH_RECOGNIZE execution is composed of the
+following series of transforms:&lt;/p>
+&lt;ol>
+&lt;li>A &lt;code>ParDo&lt;/code> transform and then a &lt;code>GroupByKey&lt;/code> transform that build up the partitions (PARTITION BY).&lt;/li>
+&lt;li>A &lt;code>ParDo&lt;/code> transform that sorts within each partition (ORDER BY).&lt;/li>
+&lt;li>A &lt;code>ParDo&lt;/code> transform that applies pattern-match in each sorted partition.&lt;/li>
+&lt;/ol>
+&lt;p>Pattern matching was first implemented with the Java regex library. That is, I first transform the rows within a partition into
+a string and then apply regex pattern-match routines. If a row satisfies a condition, I output the corresponding pattern variable.
+This is acceptable under the assumption that the pattern definitions are mutually exclusive. That is, a pattern definition like &lt;code>A AS A.price &amp;gt; 0, B AS b.price &amp;lt; 0&lt;/code> is allowed, while
+a pattern definition like &lt;code>A AS A.price &amp;gt; 0, B AS B.proctime &amp;gt; 0&lt;/code> might result in an incomplete match. In the latter case,
+an event can satisfy the conditions A and B at the same time. Mutually exclusive conditions give a deterministic pattern match:
+each event can only belong to at most one pattern class.&lt;/p>
+&lt;p>As specified in the SQL 2016 document, MATCH_RECOGNIZE defines a richer set of expressions than regular expressions. Specifically,
+it introduces &lt;em>Row Pattern Navigation Operations&lt;/em> such as &lt;code>PREV&lt;/code> and &lt;code>NEXT&lt;/code>. This is perhaps one of the most intriguing features of
+MATCH_RECOGNIZE. A regex library would no longer suffice, since the pattern definition could be back-referencing (&lt;code>PREV&lt;/code>) or
+forward-referencing (&lt;code>NEXT&lt;/code>). So for the second version of the implementation, we chose to use an NFA regex engine. An NFA brings more flexibility
+in terms of non-determinism (see Chapter 6 of SQL 2016 Part 5 for a more thorough discussion). My proposed NFA is based on a paper from UMass.&lt;/p>
+&lt;p>This is a work in progress. Many components are still unsupported. I will list some unimplemented work in the section
+of future work.&lt;/p>
+&lt;h2 id="usages">Usages&lt;/h2>
+&lt;p>For now, the components I supported are:&lt;/p>
+&lt;ul>
+&lt;li>PARTITION BY&lt;/li>
+&lt;li>ORDER BY&lt;/li>
+&lt;li>MEASURES
+&lt;ol>
+&lt;li>LAST&lt;/li>
+&lt;li>FIRST&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;li>ONE ROW PER MATCH/ALL ROWS PER MATCH&lt;/li>
+&lt;li>DEFINE
+&lt;ol>
+&lt;li>Left side of the condition
+&lt;ol>
+&lt;li>LAST&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;li>Right side of the condition
+&lt;ol>
+&lt;li>PREV&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;li>Quantifier
+&lt;ol>
+&lt;li>Kleene plus&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;/ul>
+&lt;p>The pattern definition evaluation is hard-coded. To be more specific, it expects the column reference of the incoming row
+to be on the left side of a comparator. Additionally, the PREV function can only appear on the right side of the comparator.&lt;/p>
+&lt;p>Even with these limited tools, we can already write slightly more complicated queries. Imagine we have the following
+table:&lt;/p>
+&lt;table>
+&lt;thead>
+&lt;tr>
+&lt;th align="center">transTime&lt;/th>
+&lt;th align="center">price&lt;/th>
+&lt;/tr>
+&lt;/thead>
+&lt;tbody>
+&lt;tr>
+&lt;td align="center">1&lt;/td>
+&lt;td align="center">3&lt;/td>
+&lt;/tr>
+&lt;tr>
+&lt;td align="center">2&lt;/td>
+&lt;td align="center">2&lt;/td>
+&lt;/tr>
+&lt;tr>
+&lt;td align="center">3&lt;/td>
+&lt;td align="center">1&lt;/td>
+&lt;/tr>
+&lt;tr>
+&lt;td align="center">4&lt;/td>
+&lt;td align="center">5&lt;/td>
+&lt;/tr>
+&lt;tr>
+&lt;td align="center">5&lt;/td>
+&lt;td align="center">6&lt;/td>
+&lt;/tr>
+&lt;/tbody>
+&lt;/table>
+&lt;p>This table reflects the price changes of a product with respect to the transaction time. We could write the following
+query:&lt;/p>
+&lt;div class=language-sql>
+&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="k">SELECT&lt;/span> &lt;span class="o">*&lt;/span>
+&lt;span class="k">FROM&lt;/span> &lt;span class="n">MyTable&lt;/span>
+&lt;span class="n">MATCH_RECOGNIZE&lt;/span> &lt;span class="p">(&lt;/span>
+&lt;span class="k">ORDER&lt;/span> &lt;span class="k">BY&lt;/span> &lt;span class="n">transTime&lt;/span>
+&lt;span class="n">MEASURES&lt;/span>
+&lt;span class="k">LAST&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">A&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">price&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">beforePrice&lt;/span>&lt;span class="p">,&lt;/span>
+&lt;span class="k">FIRST&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">B&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">price&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">afterPrice&lt;/span>
+&lt;span class="n">PATTERN&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">A&lt;/span>&lt;span class="o">+&lt;/span> &lt;span class="n">B&lt;/span>&lt;span class="o">+&lt;/span>&lt;span class="p">)&lt;/span>
+&lt;span class="n">DEFINE&lt;/span>
+&lt;span class="n">A&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">price&lt;/span> &lt;span class="o">&amp;lt;&lt;/span> &lt;span class="n">PREV&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">A&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">price&lt;/span>&lt;span class="p">),&lt;/span>
+&lt;span class="n">B&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">price&lt;/span> &lt;span class="o">&amp;gt;&lt;/span> &lt;span class="n">PREV&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">B&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">price&lt;/span>&lt;span class="p">)&lt;/span>
+&lt;span class="p">)&lt;/span> &lt;span class="k">AS&lt;/span> &lt;span class="n">T&lt;/span>&lt;/code>&lt;/pre>&lt;/div>
+&lt;/div>
+&lt;p>This query finds a local minimum price and the price immediately after it. For the example dataset, the first three rows
+are mapped to A and the remaining rows to B, so the result is (1, 5).&lt;/p>
+&lt;blockquote>
+&lt;p>Important: my NFA implementation slightly deviates from the SQL standard. Because the buffered NFA
+only stores an event in the buffer if the event matches some pattern class, there is no way to retrieve the
+previous event if the previous row was discarded. As a result, when PREV is used, the first row is always
+treated as a match (unlike in the standard).&lt;/p>
+&lt;/blockquote>
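As a sanity check of that semantics, the following standalone Python sketch (hypothetical code, not Beam's implementation) replays the A+ B+ match over the example prices, treating the first row of each pattern class as an automatic match:

```python
def match_a_plus_b_plus(prices):
    """Greedy sketch of PATTERN (A+ B+) with
    A: price < PREV(A.price) and B: price > PREV(B.price).
    The first row of each pattern class has no buffered previous
    row, so it is treated as a match (see the note above)."""
    a = [prices[0]]                        # first row always matches A
    i = 1
    while i < len(prices) and prices[i] < a[-1]:
        a.append(prices[i])                # strictly decreasing run -> A
        i += 1
    b = [prices[i]]                        # first row of B also matches
    i += 1
    while i < len(prices) and prices[i] > b[-1]:
        b.append(prices[i])                # strictly increasing run -> B
        i += 1
    return a, b

a, b = match_a_plus_b_plus([3, 2, 1, 5, 6])
# beforePrice = LAST(A.price), afterPrice = FIRST(B.price)
print(a[-1], b[0])  # 1 5
```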
+&lt;h2 id="progress">Progress&lt;/h2>
+&lt;ol>
+&lt;li>PRs
+&lt;ol>
+&lt;li>&lt;a href="https://github.com/apache/beam/pull/12232">Support MATCH_RECOGNIZE using regex library&lt;/a> (merged)&lt;/li>
+&lt;li>&lt;a href="https://github.com/apache/beam/pull/12532">Support MATCH_RECOGNIZE using NFA&lt;/a> (pending)&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;li>Commits
+&lt;ol>
+&lt;li>partition by: &lt;a href="https://github.com/apache/beam/pull/12232/commits/064ada7257970bcb1d35530be1b88cb3830f242b">commit 064ada7&lt;/a>&lt;/li>
+&lt;li>order by: &lt;a href="https://github.com/apache/beam/pull/12232/commits/9cd1a82bec7b2f7c44aacfbd72f5f775bb58b650">commit 9cd1a82&lt;/a>&lt;/li>
+&lt;li>regex pattern match: &lt;a href="https://github.com/apache/beam/pull/12232/commits/8d6ffcc213e30999fc495c119b68da4f62fad258">commit 8d6ffcc&lt;/a>&lt;/li>
+&lt;li>support quantifiers: &lt;a href="https://github.com/apache/beam/pull/12232/commits/f529b876a2c2e43d012c71b3a83ebd55eb16f4ff">commit f529b87&lt;/a>&lt;/li>
+&lt;li>measures: &lt;a href="https://github.com/apache/beam/pull/12232/commits/87935746647611aa139d664ebed10c8e638bb024">commit 8793574&lt;/a>&lt;/li>
+&lt;li>added NFA implementation: &lt;a href="https://github.com/apache/beam/pull/12532/commits/fc731f2b0699d11853e7b76da86456427d434a2a">commit fc731f2&lt;/a>&lt;/li>
+&lt;li>implemented functions PREV and LAST: &lt;a href="https://github.com/apache/beam/pull/12532/commits/fc731f2b0699d11853e7b76da86456427d434a2a">commit 35323da&lt;/a>&lt;/li>
+&lt;/ol>
+&lt;/li>
+&lt;/ol>
+&lt;h2 id="future-work">Future Work&lt;/h2>
+&lt;ul>
+&lt;li>Support FINAL/RUNNING keywords.&lt;/li>
+&lt;li>Support more quantifiers.&lt;/li>
+&lt;li>Add optimization to the NFA.&lt;/li>
+&lt;li>A better way to realize MATCH_RECOGNIZE might be to add a Complex Event Processing library to Beam core (rather than using Beam transforms).&lt;/li>
+&lt;/ul>
+&lt;h2 id="references">References&lt;/h2>
+&lt;ul>
+&lt;li>&lt;a href="https://drive.google.com/file/d/1ZuFZV4dCFVPZW_-RiqbU0w-vShaZh_jX/view?usp=sharing">Project Proposal&lt;/a>&lt;/li>
+&lt;li>&lt;a href="https://s.apache.org/beam-sql-pattern-recognization">Design Documentation&lt;/a>&lt;/li>
+&lt;li>&lt;a href="https://www.iso.org/standard/65143.html">SQL 2016 documentation Part 5&lt;/a>&lt;/li>
+&lt;li>&lt;a href="https://dl.acm.org/doi/10.1145/1376616.1376634">UMASS paper on NFA with shared buffer&lt;/a>&lt;/li>
+&lt;/ul></description><link>/blog/pattern-match-beam-sql/</link><pubDate>Thu, 27 Aug 2020 00:00:01 +0800</pubDate><guid>/blog/pattern-match-beam-sql/</guid><category>blog</category></item><item><title>Improved Annotation Support for the Python SDK</title><description>
 &lt;!--
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@@ -667,90 +855,4 @@ See the &lt;a href="/get-started/downloads/#2190-2020-02-04">download page&lt;/a
 , Kengo Seki, Kenneth Jung, Kenneth Knowles, Kyle Weaver, Kyle Winkelman, Lukas Drbal, Marek Simunek, Mark Liu, Maximilian Michels, Melissa Pashniak
 , Michael Luckey, Michal Walenia, Mike Pedersen, Mikhail Gryzykhin, Niel Markwick, Pablo Estrada, Pascal Gula, Rehman Murad Ali, Reuven Lax, Rob, Robbe Sneyders
 , Robert Bradshaw, Robert Burke, Rui Wang, Ruoyun Huang, Ryan Williams, Sam Rohde, Sam Whittle, Scott Wegner, Shoaib Zafar, Thomas Weise, Tianyang Hu, Tyler Akidau
-, Udi Meiri, Valentyn Tymofieiev, Xinyu Liu, XuMingmin, ttanay, tvalentyn, Łukasz Gajowy&lt;/p></description><link>/blog/beam-2.20.0/</link><pubDate>Wed, 15 Apr 2020 00:00:01 -0800</pubDate><guid>/blog/beam-2.20.0/</guid><category>blog</category></item><item><title>Apache Beam 2.19.0</title><description>
-&lt;!--
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-http://www.apache.org/licenses/LICENSE-2.0
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
--->
-&lt;p>We are happy to present the new 2.19.0 release of Beam. This release includes both improvements and new functionality.
-See the &lt;a href="/get-started/downloads/#2190-2020-02-04">download page&lt;/a> for this release.&lt;/p>
-&lt;p>For more information on changes in 2.19.0, check out the
-&lt;a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&amp;amp;version=12346582">detailed release notes&lt;/a>.&lt;/p>
-&lt;h2 id="highlights">Highlights&lt;/h2>
-&lt;ul>
-&lt;li>Multiple improvements made into Python SDK harness:
-&lt;a href="https://issues.apache.org/jira/browse/BEAM-8624">BEAM-8624&lt;/a>,
-&lt;a href="https://issues.apache.org/jira/browse/BEAM-8623">BEAM-8623&lt;/a>,
-&lt;a href="https://issues.apache.org/jira/browse/BEAM-7949">BEAM-7949&lt;/a>,
-&lt;a href="https://issues.apache.org/jira/browse/BEAM-8935">BEAM-8935&lt;/a>,
-&lt;a href="https://issues.apache.org/jira/browse/BEAM-8816">BEAM-8816&lt;/a>&lt;/li>
-&lt;/ul>
-&lt;h3 id="ios">I/Os&lt;/h3>
-&lt;ul>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-1440">BEAM-1440&lt;/a> Create a BigQuery source (that implements iobase.BoundedSource) for Python SDK&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-2572">BEAM-2572&lt;/a> Implement an S3 filesystem for Python SDK&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-5192">BEAM-5192&lt;/a> Support Elasticsearch 7.x&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8745">BEAM-8745&lt;/a> More fine-grained controls for the size of a BigQuery Load job&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8801">BEAM-8801&lt;/a> PubsubMessageToRow should not check useFlatSchema() in processElement&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8953">BEAM-8953&lt;/a> Extend ParquetIO.Read/ReadFiles.Builder to support Avro GenericData model&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8946">BEAM-8946&lt;/a> Report collection size from MongoDBIOIT&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8978">BEAM-8978&lt;/a> Report saved data size from HadoopFormatIOIT&lt;/li>
-&lt;/ul>
-&lt;h3 id="new-features--improvements">New Features / Improvements&lt;/h3>
-&lt;ul>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-6008">BEAM-6008&lt;/a> Improve error reporting in Java/Python PortableRunner&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8296">BEAM-8296&lt;/a> Containerize the Spark job server&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8746">BEAM-8746&lt;/a> Allow the local job service to work from inside docker&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8837">BEAM-8837&lt;/a> PCollectionVisualizationTest: possible bug&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8139">BEAM-8139&lt;/a> Execute portable Spark application jar&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9019">BEAM-9019&lt;/a> Improve Spark Encoders (wrappers of beam coders)&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9053">BEAM-9053&lt;/a> Improve error message when unable to get the correct filesystem for specified path in Python SDK) Improve error message when unable to get the correct filesystem for specified path in Python SDK&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9055">BEAM-9055&lt;/a> Unify the config names of Fn Data API across languages&lt;/li>
-&lt;/ul>
-&lt;h3 id="sql">SQL&lt;/h3>
-&lt;ul>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-5690">BEAM-5690&lt;/a> Issue with GroupByKey in BeamSql using SparkRunner&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8993">BEAM-8993&lt;/a> [SQL] MongoDb should use predicate push-down&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8844">BEAM-8844&lt;/a> [SQL] Create performance tests for BigQueryTable&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9023">BEAM-9023&lt;/a> Upgrade to ZetaSQL 2019.12.1&lt;/li>
-&lt;/ul>
-&lt;h3 id="breaking-changes">Breaking Changes&lt;/h3>
-&lt;ul>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8989">BEAM-8989&lt;/a> Backwards incompatible change in ParDo.getSideInputs (caught by failure when running Apache Nemo quickstart)&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8402">BEAM-8402&lt;/a> Backwards incompatible change related to how Environments are represented in Python &lt;code>DirectRunner&lt;/code>.&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9218">BEAM-9218&lt;/a> Template staging broken on Beam 2.18.0&lt;/li>
-&lt;/ul>
-&lt;h3 id="dependency-changes">Dependency Changes&lt;/h3>
-&lt;ul>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8696">BEAM-8696&lt;/a> Beam Dependency Update Request: com.google.protobuf:protobuf-java&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8701">BEAM-8701&lt;/a> Beam Dependency Update Request: commons-io:commons-io&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8716">BEAM-8716&lt;/a> Beam Dependency Update Request: org.apache.commons:commons-csv&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8717">BEAM-8717&lt;/a> Beam Dependency Update Request: org.apache.commons:commons-lang3&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8749">BEAM-8749&lt;/a> Beam Dependency Update Request: com.datastax.cassandra:cassandra-driver-mapping&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-5546">BEAM-5546&lt;/a> Beam Dependency Update Request: commons-codec:commons-codec&lt;/li>
-&lt;/ul>
-&lt;h3 id="bugfixes">Bugfixes&lt;/h3>
-&lt;ul>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9123">BEAM-9123&lt;/a> HadoopResourceId returns wrong directory name&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8962">BEAM-8962&lt;/a> FlinkMetricContainer causes churn in the JobManager and lets the web frontend malfunction&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-5495">BEAM-5495&lt;/a> PipelineResources algorithm is not working in most environments&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8025">BEAM-8025&lt;/a> Cassandra IO classMethod test is flaky&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8577">BEAM-8577&lt;/a> FileSystems may have not be initialized during ResourceId deserialization&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8582">BEAM-8582&lt;/a> Python SDK emits duplicate records for Default and AfterWatermark triggers&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8943">BEAM-8943&lt;/a> SDK harness servers don&amp;rsquo;t shut down properly when SDK harness environment cleanup fails&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8995">BEAM-8995&lt;/a> apache_beam.io.gcp.bigquery_read_it_test failing on Py3.5 PC with: TypeError: the JSON object must be str, not &amp;lsquo;bytes&amp;rsquo;&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-8999">BEAM-8999&lt;/a> PGBKCVOperation does not respect timestamp combiners&lt;/li>
-&lt;li>&lt;a href="https://issues.apache.org/jira/browse/BEAM-9050">BEAM-9050&lt;/a> Beam pickler doesn&amp;rsquo;t pickle classes that have &lt;strong>module&lt;/strong> set to None.&lt;/li>
-&lt;li>&lt;/li>
-&lt;li>Various bug fixes and performance improvements.&lt;/li>
-&lt;/ul>
-&lt;h2 id="list-of-contributors">List of Contributors&lt;/h2>
-&lt;p>According to git shortlog, the following people contributed to the 2.19.0 release. Thank you to all contributors!&lt;/p>
-&lt;p>Ahmet Altay, Alex Amato, Alexey Romanenko, Andrew Pilloud, Ankur Goenka, Anton Kedin, Boyuan Zhang, Brian Hulette, Brian Martin, Chamikara Jayalath, Charles Chen, Craig Chambers, Daniel Oliveira, David Moravek, David Rieber, Dustin Rhodes, Etienne Chauchot, Gleb Kanterov, Hai Lu, Heejong Lee, Ismaël Mejía, Jan Lukavský, Jason Kuster, Jean-Baptiste Onofré, Jeff Klukas, João Cabrita, J Ross Thomson, Juan Rael, Juta, Kasia Kucharczyk, Kengo Seki, Kenneth Jung, Kenneth Knowles, Kyle We [...]
\ No newline at end of file
+, Udi Meiri, Valentyn Tymofieiev, Xinyu Liu, XuMingmin, ttanay, tvalentyn, Łukasz Gajowy&lt;/p></description><link>/blog/beam-2.20.0/</link><pubDate>Wed, 15 Apr 2020 00:00:01 -0800</pubDate><guid>/blog/beam-2.20.0/</guid><category>blog</category></item></channel></rss>
\ No newline at end of file
diff --git a/website/generated-content/index.html b/website/generated-content/index.html
index 931e8dd..cbd6ef0 100644
--- a/website/generated-content/index.html
+++ b/website/generated-content/index.html
@@ -5,7 +5,7 @@
 <a class="button button--primary" href=/get-started/try-apache-beam/>Try Beam</a>
 <a class="button button--primary" href=/get-started/downloads/>Download Beam SDK 2.23.0</a></div><div class=hero__ctas><a class=button href=/get-started/quickstart-java/>Java Quickstart</a>
 <a class=button href=/get-started/quickstart-py/>Python Quickstart</a>
-<a class=button href=/get-started/quickstart-go/>Go Quickstart</a></div></div></div><div class=hero__cols__col><div class=hero__blog><div class=hero__blog__title>The latest from the blog</div><div class=hero__blog__cards><a class=hero__blog__cards__card href=/blog/python-improved-annotations/><div class=hero__blog__cards__card__title>Improved Annotation Support for the Python SDK</div><div class=hero__blog__cards__card__date>Aug 21, 2020</div></a><a class=hero__blog__cards__card href=/bl [...]
+<a class=button href=/get-started/quickstart-go/>Go Quickstart</a></div></div></div><div class=hero__cols__col><div class=hero__blog><div class=hero__blog__title>The latest from the blog</div><div class=hero__blog__cards><a class=hero__blog__cards__card href=/blog/pattern-match-beam-sql/><div class=hero__blog__cards__card__title>Pattern Matching with Beam SQL</div><div class=hero__blog__cards__card__date>Aug 27, 2020</div></a><a class=hero__blog__cards__card href=/blog/python-improved-an [...]
 <a class="button button--primary" href=/get-started/downloads/>Download Beam SDK 2.23.0</a></div><div class=ctas__ctas><a class=button href=/get-started/quickstart-java/>Java Quickstart</a>
 <a class=button href=/get-started/quickstart-py/>Python Quickstart</a>
 <a class=button href=/get-started/quickstart-go/>Go Quickstart</a></div></div></div><footer class=footer><div class=footer__contained><div class=footer__cols><div class=footer__cols__col><div class=footer__cols__col__logo><img src=/images/beam_logo_circle.svg class=footer__logo alt="Beam logo"></div><div class=footer__cols__col__logo><img src=/images/apache_logo_circle.svg class=footer__logo alt="Apache logo"></div></div><div class="footer__cols__col footer__cols__col--md"><div class=foo [...]
diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index 0b4c6a8..261cbcf 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/categories/blog/</loc><lastmod>2020-08-26T13:09:05-05:00</lastmod></url><url><loc>/blog/</loc><lastmod>2020-08-26T13:09:05-05:00</lastmod></url><url><loc>/categories/</loc><lastmod>2020-08-26T13:09:05-05:00</lastmod></url><url><loc>/blog/python-improved-annotations/</loc><lastmod>2020-08-26T13:09:05-05:00</lastmod></url>< [...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/categories/blog/</loc><lastmod>2020-09-18T04:29:24+08:00</lastmod></url><url><loc>/blog/</loc><lastmod>2020-09-18T04:29:24+08:00</lastmod></url><url><loc>/categories/</loc><lastmod>2020-09-18T04:29:24+08:00</lastmod></url><url><loc>/blog/pattern-match-beam-sql/</loc><lastmod>2020-09-18T04:29:24+08:00</lastmod></url><url>< [...]
\ No newline at end of file