Posted to commits@beam.apache.org by me...@apache.org on 2017/08/21 18:49:29 UTC

[beam-site] branch mergebot updated (cda8a7d -> 0c06b02)

This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


    from cda8a7d  This closes #291
     add 4c5cf9a  Prepare repository for deployment.
     new 20ccf5c  add a page for DSLs:SQL
     new 8d79b65  fix typo issues
     new c8ae4c4  add intro of BeamSqlRow, trigger settings of aggregation/join,
     new 6311936  typo fix
     new ba662cd  change page dsls/sql.md into a blog post
     new 581ec75  split into 1 blog + 1 development doc.
     new e3b4815  fix broken link
     new 2a52cb3  update to latest API/usages
     new 292e22c  remove post page move internal imple to the end of page
     new 0c06b02  This closes #268

The 10 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/contribute/team/index.html   |   9 +
 src/_includes/header.html            |   4 +
 src/documentation/dsls/sql.md        | 381 +++++++++++++++++++++++++++++++++++
 src/images/beam_sql_dsl_workflow.png | Bin 0 -> 73461 bytes
 4 files changed, 394 insertions(+)
 create mode 100644 src/documentation/dsls/sql.md
 create mode 100644 src/images/beam_sql_dsl_workflow.png

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" <co...@beam.apache.org>'].

[beam-site] 02/10: fix typo issues

Posted by me...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 8d79b6501ec1e9daf73f79f5a2da662ac621f34e
Author: mingmxu <mi...@ebay.com>
AuthorDate: Mon Jul 10 10:24:05 2017 -0700

    fix typo issues
---
 src/documentation/dsls/sql.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/documentation/dsls/sql.md b/src/documentation/dsls/sql.md
index df553ff..c9cb0f7 100644
--- a/src/documentation/dsls/sql.md
+++ b/src/documentation/dsls/sql.md
@@ -24,7 +24,7 @@ Figure 1 describes the back-end steps from a SQL statement to a Beam `PTransform
 
 **Figure 1** workflow of Beam SQL DSL
 
-Given a PCollection and the query as input, first of all the input PCollections is registered as a table in the schema repository. Then it's processed as:
+Given a PCollection and the query as input, first of all the input PCollection is registered as a table in the schema repository. Then it's processed as:
 
 1. SQL query is parsed according to grammar to generate a SQL Abstract Syntax Tree;
 2. Validate against table schema, and output a logical plan represented with relational algebras;
@@ -108,7 +108,7 @@ SELECT f_int, COUNT(*) AS `size` FROM PCOLLECTION GROUP BY f_int, SESSION(f_time
 
 Note: distinct aggregation is not supported yet.
 
-**4. Join (inner, left_outer, right_out);**
+**4. Join (inner, left_outer, right_outer);**
 
 The scenarios of join can be categorized into 3 cases:
 
@@ -151,7 +151,7 @@ PCollection<BeamSqlRow> result =
 
 **create and specify User Defined Aggregate Function (UDAF)**
 
-A UDAF aggregates a set of grouped scalar values, and outputs a single scalar value. To create a UDAF function, it's required to extent `org.apache.beam.dsls.sql.schema.BeamSqlUdaf<InputT, AccumT, OutputT>`, which defines 4 methods to process an aggregation:
+A UDAF aggregates a set of grouped scalar values, and outputs a single scalar value. To create a UDAF function, it's required to extend `org.apache.beam.dsls.sql.schema.BeamSqlUdaf<InputT, AccumT, OutputT>`, which defines 4 methods to process an aggregation:
 
 1. init(), to create an initial accumulate value;
 2. add(), to apply a new value to existing accumulate value;

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" <co...@beam.apache.org>.

[beam-site] 04/10: typo fix

Posted by me...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 631193622537f019e9baf57d24c785be32c292e8
Author: mingmxu <mi...@ebay.com>
AuthorDate: Wed Jul 19 11:14:14 2017 -0700

    typo fix
---
 src/documentation/dsls/sql.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/documentation/dsls/sql.md b/src/documentation/dsls/sql.md
index 8834a93..76574e5 100644
--- a/src/documentation/dsls/sql.md
+++ b/src/documentation/dsls/sql.md
@@ -32,7 +32,7 @@ Given a `PCollection` and the query as input, first of all the input `PCollectio
 3. Relational rules are applied to convert it to a physical plan, expressed with Beam components. An optimizer is optional to update the plan;
 4. Finally, the Beam physical plan is compiled as a composite `PTransform`;
 
-Here is an example to show a query that filters and projects from a input `PCollection`:
+Here is an example to show a query that filters and projects from an input `PCollection`:
 
 ```
 SELECT USER_ID, USER_NAME FROM PCOLLECTION WHERE USER_ID = 1

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" <co...@beam.apache.org>.

[beam-site] 01/10: add a page for DSLs:SQL

Posted by me...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 20ccf5c4babf4ca75391a282c3e73abc18b1118d
Author: mingmxu <mi...@ebay.com>
AuthorDate: Fri Jul 7 22:32:40 2017 -0700

    add a page for DSLs:SQL
---
 src/_includes/header.html            |   4 +
 src/documentation/dsls/sql.md        | 344 +++++++++++++++++++++++++++++++++++
 src/images/beam_sql_dsl_workflow.png | Bin 0 -> 73461 bytes
 3 files changed, 348 insertions(+)

diff --git a/src/_includes/header.html b/src/_includes/header.html
index 3bd4ced..79e72b6 100644
--- a/src/_includes/header.html
+++ b/src/_includes/header.html
@@ -62,6 +62,10 @@
             <li><a href="{{ site.baseurl }}/documentation/runners/flink/">Apache Flink Runner</a></li>
             <li><a href="{{ site.baseurl }}/documentation/runners/spark/">Apache Spark Runner</a></li>
             <li><a href="{{ site.baseurl }}/documentation/runners/dataflow/">Cloud Dataflow Runner</a></li>
+
+            <li role="separator" class="divider"></li>
+            <li class="dropdown-header">DSLs</li>
+            <li><a href="{{ site.baseurl }}/documentation/dsls/sql/">SQL</a></li>
           </ul>
         </li>
         <li class="dropdown">
diff --git a/src/documentation/dsls/sql.md b/src/documentation/dsls/sql.md
new file mode 100644
index 0000000..df553ff
--- /dev/null
+++ b/src/documentation/dsls/sql.md
@@ -0,0 +1,344 @@
+---
+layout: default
+title: "DSLs: SQL"
+permalink: /documentation/dsls/sql/
+---
+
+* [1. Overview](#overview)
+* [2. The Internal of Beam SQL](#internal-of-sql)
+* [3. Usage of DSL APIs](#usage)
+* [4. Functionality in Beam SQL](#functionality)
+  * [4.1. Supported Features](#features)
+  * [4.2. Data Types](#data-type)
+  * [4.3. built-in SQL functions](#built-in-functions)
+
+This page describes the implementation of Beam SQL, and how to simplify a Beam pipeline with DSL APIs.
+
+# <a name="overview"></a>1. Overview
+SQL is a well-adopted standard to process data with concise syntax. With DSL APIs, now `PCollection`s can be queried with standard SQL statements, like a regular table. The DSL APIs leverages [Apache Calcite](http://calcite.apache.org/) to parse and optimize SQL queries, then translate into a composite Beam `PTransform`. In this way, both SQL and normal Beam `PTransform`s can be mixed in the same pipeline.
+
+# <a name="internal-of-sql"></a>2. The Internal of Beam SQL
+Figure 1 describes the back-end steps from a SQL statement to a Beam `PTransform`.
+
+![Workflow of Beam SQL DSL]({{ "/images/beam_sql_dsl_workflow.png" | prepend: site.baseurl }} "workflow of Beam SQL DSL")
+
+**Figure 1** workflow of Beam SQL DSL
+
+Given a PCollection and the query as input, first of all the input PCollections is registered as a table in the schema repository. Then it's processed as:
+
+1. SQL query is parsed according to grammar to generate a SQL Abstract Syntax Tree;
+2. Validate against table schema, and output a logical plan represented with relational algebras;
+3. Relational rules are applied to convert it to a physical plan, expressed with Beam components. An optimizer is optional to update the plan;
+4. Finally, the Beam physical plan is compiled as a composite `PTransform`;
+
+Here is an example to show a query that filters and projects from a input PCollection:
+
+```
+SELECT USER_ID, USER_NAME FROM PCOLLECTION WHERE USER_ID = 1
+```
+
+The logical plan is shown as:
+
+```
+LogicalProject(USER_ID=[$0], USER_NAME=[$1])
+  LogicalFilter(condition=[=($0, 1)])
+    LogicalTableScan(table=[[PCOLLECTION]])
+```
+
+And compiled as a composite `PTransform`
+
+```
+pCollection.apply(BeamSqlFilter...)
+           .apply(BeamSqlProject...)
+```
+
+# <a name="usage"></a>3. Usage of DSL APIs 
+The DSL interface (`BeamSql.query()` and `BeamSql.simpleQuery()`) is the only endpoint exposed to developers. It wraps the back-end details of parsing/validation/assembling, to deliver a Beam SDK style API that can take either simple TABLE_FILTER queries or complex queries containing JOIN/GROUP_BY etc.
+
+*Note*, the two APIs are equivalent in functionality, `BeamSql.query()` applies on a `PCollectionTuple` with one or many input `PCollection`s, and `BeamSql.simpleQuery()` is a simplified API which applies on single PCollections.
+
+[BeamSqlExample](https://github.com/apache/beam/blob/DSL_SQL/dsls/sql/src/main/java/org/apache/beam/dsls/sql/example/BeamSqlExample.java) in code repository shows the usage of both APIs:
+
+```
+//Step 1. create a source PCollection with Create.of();
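+//(assuming row is a prepared BeamSqlRow and type its record type)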
+PCollection<BeamSqlRow> inputTable = PBegin.in(p).apply(Create.of(row)
+    .withCoder(new BeamSqlRowCoder(type)));
+
+//Step 2. (Case 1) run a simple SQL query over input PCollection with BeamSql.simpleQuery;
+PCollection<BeamSqlRow> outputStream = inputTable.apply(
+    BeamSql.simpleQuery("select c2, c3 from PCOLLECTION where c1=1"));
+
+
+//Step 2. (Case 2) run the query with BeamSql.query
+PCollection<BeamSqlRow> outputStream2 =
+    PCollectionTuple.of(new TupleTag<BeamSqlRow>("TABLE_B"), inputTable)
+        .apply(BeamSql.query("select c2, c3 from TABLE_B where c1=1"));
+```
+
+In Step 1, a `PCollection<BeamSqlRow>` is prepared as the source dataset. The work to generate a queryable `PCollection<BeamSqlRow>` is beyond the scope of Beam SQL DSL.
+
+Step 2(Case 1) shows the way to run a query with `BeamSql.simpleQuery()`, be aware that the input PCollection is named with a fixed table name __PCOLLECTION__. Step 2(Case 2) is an example to run a query with `BeamSql.query()`. A Table name used in the query is specified when adding PCollection to `PCollectionTuple`. As each call of either `BeamSql.query()` or `BeamSql.simpleQuery()` has its own schema repository, developers need to include all PCollections that would be used in your query.
+
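+For instance, a query across two tables must register both `PCollection`s in the same `PCollectionTuple`; below is a minimal sketch (the tags `TABLE_A`/`TABLE_B`, the inputs `tableA`/`tableB`, and the columns are hypothetical):
+
+```
+//tableA and tableB are assumed, pre-created PCollection<BeamSqlRow> inputs
+PCollection<BeamSqlRow> joined =
+    PCollectionTuple.of(new TupleTag<BeamSqlRow>("TABLE_A"), tableA)
+        .and(new TupleTag<BeamSqlRow>("TABLE_B"), tableB)
+        .apply(BeamSql.query(
+            "select a.c1, b.c3 from TABLE_A a join TABLE_B b on a.c1 = b.c1"));
+```
+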
+# <a name="functionality"></a>4. Functionality in Beam SQL
+Just as the unified model for both bounded and unbounded data in Beam, SQL DSL provides the same functionalities for bounded and unbounded PCollection as well. 
+
+Note that SQL support is not yet complete. Queries that include unsupported features would cause an UnsupportedOperationException.
+
+## <a name="features"></a>4.1. Supported Features
+The following features are supported in the current repository (this chapter will be updated continually).
+
+**1. filter clauses;**
+
+**2. data field projections;**
+
+**3. aggregations;**
+
+Beam SQL supports aggregation functions COUNT/SUM/MAX/MIN/AVG with group_by in global_window, fixed_window, sliding_window and session_window. A field with type `TIMESTAMP` is required to specify fixed_window/sliding_window/session_window, which is used as the event timestamp for rows. See below for several examples:
+
+```
+//fixed window, one hour in duration
+SELECT f_int, COUNT(*) AS `size` FROM PCOLLECTION GROUP BY f_int, TUMBLE(f_timestamp, INTERVAL '1' HOUR)
+
+//sliding window, one hour in duration and 30 minutes period
+SELECT f_int, COUNT(*) AS `size` FROM PCOLLECTION GROUP BY f_int, HOP(f_timestamp, INTERVAL '1' HOUR, INTERVAL '30' MINUTE)
+
+//session window, with 5 minutes gap duration
+SELECT f_int, COUNT(*) AS `size` FROM PCOLLECTION GROUP BY f_int, SESSION(f_timestamp, INTERVAL '5' MINUTE)
+```
+
+Note: distinct aggregation is not supported yet.
+
+**4. Join (inner, left_outer, right_out);**
+
+The scenarios of join can be categorized into 3 cases:
+
+1. BoundedTable JOIN BoundedTable
+2. UnboundedTable JOIN UnboundedTable
+3. BoundedTable JOIN UnboundedTable
+
+For cases 1 and 2, a standard join is utilized as long as the windowFn of both sides matches. For case 3, sideInput is utilized to implement the join, so there are some constraints (a sketch follows the list):
+
+* Only equal-join is supported, CROSS JOIN is not supported;
+* FULL OUTER JOIN is not supported;
+* If it's a LEFT OUTER JOIN, the unbounded table should be on the left side; if it's a RIGHT OUTER JOIN, the unbounded table should be on the right side;
+
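+Below is a minimal sketch of case 3 (the table and field names are hypothetical):
+
+```
+//an unbounded stream joined with a bounded table; the unbounded
+//side must be on the left of a LEFT OUTER JOIN
+SELECT s.f_int, s.f_string, t.f_string2
+FROM UNBOUNDED_STREAM s LEFT OUTER JOIN BOUNDED_TABLE t ON s.f_int = t.f_int
+```
+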
+**5. built-in SQL functions**
+
+**6. User Defined Function (UDF) and User Defined Aggregate Function (UDAF);**
+
+If the required function is not available, developers can register their own UDF (for scalar functions) and UDAF (for aggregate functions).
+
+**create and specify User Defined Function (UDF)**
+
+A UDF can be any Java method that takes zero or more scalar fields, and returns one scalar value. Below is an example UDF and how to use it in the DSL:
+
+```
+/**
+ * An example UDF for testing.
+ */
+public static class CubicInteger{
+  public static Integer cubic(Integer input){
+    return input * input * input;
+  }
+}
+
+// register and call in SQL
+String sql = "SELECT f_int, cubic(f_int) as cubicvalue FROM PCOLLECTION WHERE f_int = 2";
+PCollection<BeamSqlRow> result =
+    input.apply("udfExample",
+        BeamSql.simpleQuery(sql).withUdf("cubic", CubicInteger.class, "cubic"));
+```
+
+**create and specify User Defined Aggregate Function (UDAF)**
+
+A UDAF aggregates a set of grouped scalar values, and outputs a single scalar value. To create a UDAF function, it's required to extent `org.apache.beam.dsls.sql.schema.BeamSqlUdaf<InputT, AccumT, OutputT>`, which defines 4 methods to process an aggregation:
+
+1. init(), to create an initial accumulate value;
+2. add(), to apply a new value to existing accumulate value;
+3. merge(), to merge accumulate values from parallel operators;
+4. result(), to generate the final result from accumulate value;
+
+Here's an example of UDAF:
+
+```
+/**
+ * UDAF for test, which returns the sum of squares.
+ */
+public static class SquareSum extends BeamSqlUdaf<Integer, Integer, Integer> {
+  public SquareSum() {
+  }
+  
+  // @Override
+  public Integer init() {
+    return 0;
+  }
+  
+  // @Override
+  public Integer add(Integer accumulator, Integer input) {
+    return accumulator + input * input;
+  }
+  
+  // @Override
+  public Integer merge(Iterable<Integer> accumulators) {
+    int v = 0;
+    Iterator<Integer> ite = accumulators.iterator();
+    while (ite.hasNext()) {
+      v += ite.next();
+    }
+    return v;
+  }
+
+  // @Override
+  public Integer result(Integer accumulator) {
+    return accumulator;
+  }
+}
+
+//register and call in SQL
+String sql = "SELECT f_int1, squaresum(f_int2) AS `squaresum` FROM PCOLLECTION GROUP BY f_int2";
+PCollection<BeamSqlRow> result =
+    input.apply("udafExample",
+        BeamSql.simpleQuery(sql).withUdaf("squaresum", SquareSum.class));
+```
+ 
+## <a name="data-type"></a>4.2. Data Types
+Each type in Beam SQL maps to a Java class to hold the value in `BeamSqlRow`. The following table lists the relation between SQL types and Java classes supported in the current repository (a short example follows the table):
+
+| SQL Type | Java class |
+| ---- | ---- |
+| Types.INTEGER | java.lang.Integer |
+| Types.SMALLINT | java.lang.Short |
+| Types.TINYINT | java.lang.Byte |
+| Types.BIGINT | java.lang.Long |
+| Types.FLOAT | java.lang.Float |
+| Types.DOUBLE | java.lang.Double |
+| Types.DECIMAL | java.math.BigDecimal |
+| Types.VARCHAR | java.lang.String |
+| Types.TIMESTAMP | java.util.Date |
+{:.table}
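+
+As a sketch of these mappings in use, a record type can be declared with parallel lists of field names and `java.sql.Types` constants (assuming the `BeamSqlRecordType.create()` factory used elsewhere in this change set; field names are illustrative):
+
+```
+//illustrative field names; type constants come from java.sql.Types
+List<String> fieldNames = Arrays.asList("f_int", "f_varchar", "f_timestamp");
+List<Integer> fieldTypes = Arrays.asList(Types.INTEGER, Types.VARCHAR, Types.TIMESTAMP);
+BeamSqlRecordType type = BeamSqlRecordType.create(fieldNames, fieldTypes);
+```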
+
+## <a name="built-in-functions"></a>4.3. built-in SQL functions
+
+Beam SQL has implemented lots of built-in functions defined in [Apache Calcite](http://calcite.apache.org). The available functions are listed below; a combined example follows the tables:
+
+**Comparison functions and operators**
+
+| Operator syntax | Description |
+| ---- | ---- |
+| value1 = value2 | Equals |
+| value1 <> value2 | Not equal |
+| value1 > value2 | Greater than |
+| value1 >= value2 | Greater than or equal |
+| value1 < value2 | Less than |
+| value1 <= value2 | Less than or equal |
+| value IS NULL | Whether value is null |
+| value IS NOT NULL | Whether value is not null |
+{:.table}
+
+**Logical functions and operators**
+
+| Operator syntax | Description |
+| ---- | ---- |
+| boolean1 OR boolean2 | Whether boolean1 is TRUE or boolean2 is TRUE |
+| boolean1 AND boolean2 | Whether boolean1 and boolean2 are both TRUE |
+| NOT boolean | Whether boolean is not TRUE; returns UNKNOWN if boolean is UNKNOWN |
+{:.table}
+
+**Arithmetic functions and operators**
+
+| Operator syntax | Description|
+| ---- | ---- |
+| numeric1 + numeric2 | Returns numeric1 plus numeric2| 
+| numeric1 - numeric2 | Returns numeric1 minus numeric2| 
+| numeric1 * numeric2 | Returns numeric1 multiplied by numeric2| 
+| numeric1 / numeric2 | Returns numeric1 divided by numeric2| 
+| MOD(numeric1, numeric2) | Returns the remainder (modulus) of numeric1 divided by numeric2. The result is negative only if numeric1 is negative| 
+{:.table}
+
+**Math functions**
+
+| Operator syntax | Description |
+| ---- | ---- |
+| ABS(numeric) | Returns the absolute value of numeric |
+| SQRT(numeric) | Returns the square root of numeric |
+| LN(numeric) | Returns the natural logarithm (base e) of numeric |
+| LOG10(numeric) | Returns the base 10 logarithm of numeric |
+| EXP(numeric) | Returns e raised to the power of numeric |
+| ACOS(numeric) | Returns the arc cosine of numeric |
+| ASIN(numeric) | Returns the arc sine of numeric |
+| ATAN(numeric) | Returns the arc tangent of numeric |
+| COT(numeric) | Returns the cotangent of numeric |
+| DEGREES(numeric) | Converts numeric from radians to degrees |
+| RADIANS(numeric) | Converts numeric from degrees to radians |
+| SIGN(numeric) | Returns the signum of numeric |
+| SIN(numeric) | Returns the sine of numeric |
+| TAN(numeric) | Returns the tangent of numeric |
+| ROUND(numeric1, numeric2) | Rounds numeric1 to numeric2 places to the right of the decimal point |
+{:.table}
+
+**Date functions**
+
+| Operator syntax | Description |
+| ---- | ---- |
+| LOCALTIME | Returns the current date and time in the session time zone in a value of datatype TIME |
+| LOCALTIME(precision) | Returns the current date and time in the session time zone in a value of datatype TIME, with precision digits of precision |
+| LOCALTIMESTAMP | Returns the current date and time in the session time zone in a value of datatype TIMESTAMP |
+| LOCALTIMESTAMP(precision) | Returns the current date and time in the session time zone in a value of datatype TIMESTAMP, with precision digits of precision |
+| CURRENT_TIME | Returns the current time in the session time zone, in a value of datatype TIMESTAMP WITH TIME ZONE |
+| CURRENT_DATE | Returns the current date in the session time zone, in a value of datatype DATE |
+| CURRENT_TIMESTAMP | Returns the current date and time in the session time zone, in a value of datatype TIMESTAMP WITH TIME ZONE |
+| EXTRACT(timeUnit FROM datetime) | Extracts and returns the value of a specified datetime field from a datetime value expression |
+| FLOOR(datetime TO timeUnit) | Rounds datetime down to timeUnit |
+| CEIL(datetime TO timeUnit) | Rounds datetime up to timeUnit |
+| YEAR(date) | Equivalent to EXTRACT(YEAR FROM date). Returns an integer. |
+| QUARTER(date) | Equivalent to EXTRACT(QUARTER FROM date). Returns an integer between 1 and 4. |
+| MONTH(date) | Equivalent to EXTRACT(MONTH FROM date). Returns an integer between 1 and 12. |
+| WEEK(date) | Equivalent to EXTRACT(WEEK FROM date). Returns an integer between 1 and 53. |
+| DAYOFYEAR(date) | Equivalent to EXTRACT(DOY FROM date). Returns an integer between 1 and 366. |
+| DAYOFMONTH(date) | Equivalent to EXTRACT(DAY FROM date). Returns an integer between 1 and 31. |
+| DAYOFWEEK(date) | Equivalent to EXTRACT(DOW FROM date). Returns an integer between 1 and 7. |
+| HOUR(date) | Equivalent to EXTRACT(HOUR FROM date). Returns an integer between 0 and 23. |
+| MINUTE(date) | Equivalent to EXTRACT(MINUTE FROM date). Returns an integer between 0 and 59. |
+| SECOND(date) | Equivalent to EXTRACT(SECOND FROM date). Returns an integer between 0 and 59. |
+{:.table}
+
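+A quick illustration of the date functions on an event timestamp (the field `f_timestamp` is hypothetical):
+
+```
+//extract parts of a timestamp, and round it down to the day
+SELECT f_int,
+       EXTRACT(HOUR FROM f_timestamp) AS evt_hour,
+       FLOOR(f_timestamp TO DAY) AS evt_day
+FROM PCOLLECTION
+```
+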
+**String functions**
+
+| Operator syntax | Description |
+| ---- | ---- |
+| string \|\| string | Concatenates two character strings |
+| CHAR_LENGTH(string) | Returns the number of characters in a character string |
+| CHARACTER_LENGTH(string) | As CHAR_LENGTH(string) |
+| UPPER(string) | Returns a character string converted to upper case |
+| LOWER(string) | Returns a character string converted to lower case |
+| POSITION(string1 IN string2) | Returns the position of the first occurrence of string1 in string2 |
+| POSITION(string1 IN string2 FROM integer) | Returns the position of the first occurrence of string1 in string2 starting at a given point (not standard SQL) |
+| TRIM( { BOTH \| LEADING \| TRAILING } string1 FROM string2) | Removes the longest string containing only the characters in string1 from the start/end/both ends of string2 |
+| OVERLAY(string1 PLACING string2 FROM integer [ FOR integer2 ]) | Replaces a substring of string1 with string2 |
+| SUBSTRING(string FROM integer) | Returns a substring of a character string starting at a given point |
+| SUBSTRING(string FROM integer FOR integer) | Returns a substring of a character string starting at a given point with a given length |
+| INITCAP(string) | Returns string with the first letter of each word converted to upper case and the rest to lower case. Words are sequences of alphanumeric characters separated by non-alphanumeric characters. |
+{:.table}
+
+**Conditional functions**
+
+| Operator syntax | Description |
+| ---- | ---- |
+| CASE value <br>WHEN value1 [, value11 ]* THEN result1 <br>[ WHEN valueN [, valueN1 ]* THEN resultN ]* <br>[ ELSE resultZ ] <br>END | Simple case |
+| CASE <br>WHEN condition1 THEN result1 <br>[ WHEN conditionN THEN resultN ]* <br>[ ELSE resultZ ] <br>END | Searched case |
+| NULLIF(value, value) | Returns NULL if the values are the same. For example, NULLIF(5, 5) returns NULL; NULLIF(5, 0) returns 5. |
+| COALESCE(value, value [, value ]*) | Provides a value if the first value is null. For example, COALESCE(NULL, 5) returns 5. |
+{:.table}
+
+**Type conversion functions**
+
+**Aggregate functions**
+
+| Operator syntax | Description |
+| ---- | ---- |
+| COUNT(*) | Returns the number of input rows |
+| AVG(numeric) | Returns the average (arithmetic mean) of numeric across all input values |
+| SUM(numeric) | Returns the sum of numeric across all input values |
+| MAX(value) | Returns the maximum value of value across all input values |
+| MIN(value) | Returns the minimum value of value across all input values |
+{:.table}
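+
+As an illustration, several of the built-in functions above can be combined in one query (field names are hypothetical):
+
+```
+//hypothetical fields: f_int, f_double, f_string, f_string2
+SELECT f_int, ABS(f_double) AS abs_val, UPPER(f_string) AS upper_str,
+       COALESCE(f_string2, 'N/A') AS str_or_default
+FROM PCOLLECTION
+WHERE CHAR_LENGTH(f_string) > 3 AND MOD(f_int, 2) = 0
+```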
diff --git a/src/images/beam_sql_dsl_workflow.png b/src/images/beam_sql_dsl_workflow.png
new file mode 100644
index 0000000..dcd3909
Binary files /dev/null and b/src/images/beam_sql_dsl_workflow.png differ

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" <co...@beam.apache.org>.

[beam-site] 10/10: This closes #268

Posted by me...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 0c06b022bc2905511aa6c28c000fe39608c47e91
Merge: 4c5cf9a 292e22c
Author: Mergebot <me...@apache.org>
AuthorDate: Mon Aug 21 18:49:16 2017 +0000

    This closes #268

 src/_includes/header.html            |   4 +
 src/documentation/dsls/sql.md        | 381 +++++++++++++++++++++++++++++++++++
 src/images/beam_sql_dsl_workflow.png | Bin 0 -> 73461 bytes
 3 files changed, 385 insertions(+)

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" <co...@beam.apache.org>.

[beam-site] 03/10: add intro of BeamSqlRow, trigger settings of aggregation/join,

Posted by me...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit c8ae4c444af7ee37d1ce00383e1ef322ffbc8bb9
Author: mingmxu <mi...@ebay.com>
AuthorDate: Tue Jul 18 16:45:06 2017 -0700

    add intro of BeamSqlRow, trigger settings of aggregation/join,
    
    other misc spelling correction
---
 src/documentation/dsls/sql.md | 67 +++++++++++++++++++++++++++++++------------
 1 file changed, 49 insertions(+), 18 deletions(-)

diff --git a/src/documentation/dsls/sql.md b/src/documentation/dsls/sql.md
index c9cb0f7..8834a93 100644
--- a/src/documentation/dsls/sql.md
+++ b/src/documentation/dsls/sql.md
@@ -9,13 +9,14 @@ permalink: /documentation/dsls/sql/
 * [3. Usage of DSL APIs](#usage)
 * [4. Functionality in Beam SQL](#functionality)
   * [4.1. Supported Features](#features)
-  * [4.2. Data Types](#data-type)
-  * [4.3. built-in SQL functions](#built-in-functions)
+  * [4.2. Intro of BeamSqlRow](#beamsqlrow)
+  * [4.3. Data Types](#data-type)
+  * [4.4. built-in SQL functions](#built-in-functions)
 
 This page describes the implementation of Beam SQL, and how to simplify a Beam pipeline with DSL APIs.
 
 # <a name="overview"></a>1. Overview
-SQL is a well-adopted standard to process data with concise syntax. With DSL APIs, now `PCollection`s can be queried with standard SQL statements, like a regular table. The DSL APIs leverages [Apache Calcite](http://calcite.apache.org/) to parse and optimize SQL queries, then translate into a composite Beam `PTransform`. In this way, both SQL and normal Beam `PTransform`s can be mixed in the same pipeline.
+SQL is a well-adopted standard to process data with concise syntax. With DSL APIs, now `PCollection`s can be queried with standard SQL statements, like a regular table. The DSL APIs leverage [Apache Calcite](http://calcite.apache.org/) to parse and optimize SQL queries, then translate into a composite Beam `PTransform`. In this way, both SQL and normal Beam `PTransform`s can be mixed in the same pipeline.
 
 # <a name="internal-of-sql"></a>2. The Internal of Beam SQL
 Figure 1 describes the back-end steps from a SQL statement to a Beam `PTransform`.
@@ -24,14 +25,14 @@ Figure 1 describes the back-end steps from a SQL statement to a Beam `PTransform
 
 **Figure 1** workflow of Beam SQL DSL
 
-Given a PCollection and the query as input, first of all the input PCollection is registered as a table in the schema repository. Then it's processed as:
+Given a `PCollection` and the query as input, first of all the input `PCollection` is registered as a table in the schema repository. Then it's processed as:
 
 1. SQL query is parsed according to grammar to generate a SQL Abstract Syntax Tree;
 2. Validate against table schema, and output a logical plan represented with relational algebras;
 3. Relational rules are applied to convert it to a physical plan, expressed with Beam components. An optimizer is optional to update the plan;
 4. Finally, the Beam physical plan is compiled as a composite `PTransform`;
 
-Here is an example to show a query that filters and projects from a input PCollection:
+Here is an example to show a query that filters and projects from a input `PCollection`:
 
 ```
 SELECT USER_ID, USER_NAME FROM PCOLLECTION WHERE USER_ID = 1
@@ -55,7 +56,7 @@ pCollection.apply(BeamSqlFilter...)
 # <a name="usage"></a>3. Usage of DSL APIs 
 The DSL interface (`BeamSql.query()` and `BeamSql.simpleQuery()`) is the only endpoint exposed to developers. It wraps the back-end details of parsing/validation/assembling, to deliver a Beam SDK style API that can take either simple TABLE_FILTER queries or complex queries containing JOIN/GROUP_BY etc.
 
-*Note*, the two APIs are equivalent in functionality, `BeamSql.query()` applies on a `PCollectionTuple` with one or many input `PCollection`s, and `BeamSql.simpleQuery()` is a simplified API which applies on single PCollections.
+*Note*, the two APIs are equivalent in functionality, `BeamSql.query()` applies on a `PCollectionTuple` with one or many input `PCollection`s, and `BeamSql.simpleQuery()` is a simplified API which applies on a single `PCollection`.
 
 [BeamSqlExample](https://github.com/apache/beam/blob/DSL_SQL/dsls/sql/src/main/java/org/apache/beam/dsls/sql/example/BeamSqlExample.java) in code repository shows the usage of both APIs:
 
@@ -77,10 +78,10 @@ PCollection<BeamSqlRow> outputStream2 =
 
 In Step 1, a `PCollection<BeamSqlRow>` is prepared as the source dataset. The work to generate a queryable `PCollection<BeamSqlRow>` is beyond the scope of Beam SQL DSL.
 
-Step 2(Case 1) shows the way to run a query with `BeamSql.simpleQuery()`, be aware that the input PCollection is named with a fixed table name __PCOLLECTION__. Step 2(Case 2) is an example to run a query with `BeamSql.query()`. A Table name used in the query is specified when adding PCollection to `PCollectionTuple`. As each call of either `BeamSql.query()` or `BeamSql.simpleQuery()` has its own schema repository, developers need to include all PCollections that would be used in your query.
+Step 2 (Case 1) shows the way to run a query with `BeamSql.simpleQuery()`; be aware that the input `PCollection` is named with a fixed table name __PCOLLECTION__. Step 2 (Case 2) is an example to run a query with `BeamSql.query()`. The table name used in the query is specified when adding a `PCollection` to the `PCollectionTuple`. As each call of either `BeamSql.query()` or `BeamSql.simpleQuery()` has its own schema repository, developers need to include all `PCollection`s that would be used in your query.
 
 # <a name="functionality"></a>4. Functionality in Beam SQL
-Just as the unified model for both bounded and unbounded data in Beam, SQL DSL provides the same functionalities for bounded and unbounded PCollection as well. 
+Just as the unified model for both bounded and unbounded data in Beam, SQL DSL provides the same functionalities for bounded and unbounded `PCollection` as well. 
 
 Note that SQL support is not yet complete. Queries that include unsupported features would cause an UnsupportedOperationException.
 
@@ -106,7 +107,16 @@ SELECT f_int, COUNT(*) AS `size` FROM PCOLLECTION GROUP BY f_int, HOP(f_timestam
 SELECT f_int, COUNT(*) AS `size` FROM PCOLLECTION GROUP BY f_int, SESSION(f_timestamp, INTERVAL '5' MINUTE)
 ```
 
-Note: distinct aggregation is not supported yet.
+Note: 
+
+1. distinct aggregation is not supported yet.
+2. the default trigger is `Repeatedly.forever(AfterWatermark.pastEndOfWindow())`;
+3. when the `time` field in `HOP(dateTime, slide, size [, time ])`/`TUMBLE(dateTime, interval [, time ])`/`SESSION(dateTime, interval [, time ])` is specified, a lateFiring trigger is added as 
+
+```
+Repeatedly.forever(AfterWatermark.pastEndOfWindow().withLateFirings(AfterProcessingTime
+        .pastFirstElementInPane().plusDelayOf(Duration.millis(delayTime.getTimeInMillis()))));
+```		
 
 **4. Join (inner, left_outer, right_outer);**
 
@@ -121,6 +131,7 @@ For case 1 and case 2, a standard join is utilized as long as the windowFn of th
 * Only equal-join is supported, CROSS JOIN is not supported;
 * FULL OUTER JOIN is not supported;
 * If it's a LEFT OUTER JOIN, the unbounded table should be on the left side; if it's a RIGHT OUTER JOIN, the unbounded table should be on the right side;
+* The trigger is inherited from upstream, which should be consistent;
 
 **5. built-in SQL functions**
 
@@ -136,8 +147,8 @@ A UDF can be any Java method that takes zero or more scalar fields, and return o
 /**
  * An example UDF for testing.
  */
-public static class CubicInteger{
-  public static Integer cubic(Integer input){
+public static class CubicInteger implements BeamSqlUdf{
+  public static Integer eval(Integer input){
     return input * input * input;
   }
 }
@@ -146,7 +157,7 @@ public static class CubicInteger{
 String sql = "SELECT f_int, cubic(f_int) as cubicvalue FROM PCOLLECTION WHERE f_int = 2";
 PCollection<BeamSqlRow> result =
     input.apply("udfExample",
-        BeamSql.simpleQuery(sql).withUdf("cubic", CubicInteger.class, "cubic"));
+        BeamSql.simpleQuery(sql).withUdf("cubic", CubicInteger.class));
 ```
 
 **create and specify User Defined Aggregate Function (UDAF)**
@@ -168,17 +179,17 @@ public static class SquareSum extends BeamSqlUdaf<Integer, Integer, Integer> {
   public SquareSum() {
   }
   
-  // @Override
+  @Override
   public Integer init() {
     return 0;
   }
   
-  // @Override
+  @Override
   public Integer add(Integer accumulator, Integer input) {
     return accumulator + input * input;
   }
   
-  // @Override
+  @Override
   public Integer merge(Iterable<Integer> accumulators) {
     int v = 0;
     Iterator<Integer> ite = accumulators.iterator();
@@ -188,7 +199,7 @@ public static class SquareSum extends BeamSqlUdaf<Integer, Integer, Integer> {
     return v;
   }
 
-  // @Override
+  @Override
   public Integer result(Integer accumulator) {
     return accumulator;
   }
@@ -201,7 +212,27 @@ PCollection<BeamSqlRow> result =
         BeamSql.simpleQuery(sql).withUdaf("squaresum", SquareSum.class));
 ```
  
-## <a name="data-type"></a>4.2. Data Types
+## <a name="beamsqlrow"></a>4.2. Intro of BeamSqlRow
+`BeamSqlRow`, encoded/decoded by `BeamSqlRowCoder`, represents a single, implicitly structured data item in a Beam SQL compatible `PCollection`. Similar to a _row_ in a relational database, each `BeamSqlRow` consists of named columns with fixed types (see [4.3. Data Types](#data-type)).
+
+A Beam SQL `PCollection` can be created from an external source, from in-memory data, or derived from another SQL query. For `PCollection`s from an external source or in-memory data, it's required to specify the coder explicitly; a `PCollection` derived from a SQL query has the coder set already. Below is one example:
+
+```
+//define the input row format
+List<String> fieldNames = Arrays.asList("c1", "c2", "c3");
+List<Integer> fieldTypes = Arrays.asList(Types.INTEGER, Types.VARCHAR, Types.DOUBLE);
+BeamSqlRecordType type = BeamSqlRecordType.create(fieldNames, fieldTypes);
+BeamSqlRow row = new BeamSqlRow(type);
+row.addField(0, 1);
+row.addField(1, "row");
+row.addField(2, 1.0);
+
+//create a source PCollection with Create.of();
+PCollection<BeamSqlRow> inputTable = PBegin.in(p).apply(Create.of(row)
+    .withCoder(new BeamSqlRowCoder(type)));
+```
+ 
+## <a name="data-type"></a>4.3. Data Types
 Each type in Beam SQL maps to a Java class to hold the value in `BeamSqlRow`. The following table lists the relation between SQL types and Java classes supported in the current repository (a short example follows the table):
 
 | SQL Type | Java class |
@@ -217,7 +248,7 @@ Each type in Beam SQL maps to a Java class to holds the value in `BeamSqlRow`. T
 | Types.TIMESTAMP | java.util.Date |
 {:.table}
 
-## <a name="built-in-functions"></a>4.3. built-in SQL functions
+## <a name="built-in-functions"></a>4.4. built-in SQL functions
 
 Beam SQL has implemented lots of built-in functions defined in [Apache Calcite](http://calcite.apache.org). The available functions are listed below; a combined example follows the tables:
 

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" <co...@beam.apache.org>.

[beam-site] 06/10: split into 1 blog + 1 development doc.

Posted by me...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 581ec758058ff02070ed08611608c12f0a654dce
Author: mingmxu <mi...@ebay.com>
AuthorDate: Wed Aug 2 11:23:00 2017 -0700

    split into 1 blog + 1 development doc.
---
 src/_includes/header.html                          |   4 +
 src/_posts/2017-07-21-sql-dsl.md                   | 340 +--------------------
 .../dsls/sql.md}                                   |  14 +-
 3 files changed, 16 insertions(+), 342 deletions(-)

diff --git a/src/_includes/header.html b/src/_includes/header.html
index 3bd4ced..79e72b6 100644
--- a/src/_includes/header.html
+++ b/src/_includes/header.html
@@ -62,6 +62,10 @@
             <li><a href="{{ site.baseurl }}/documentation/runners/flink/">Apache Flink Runner</a></li>
             <li><a href="{{ site.baseurl }}/documentation/runners/spark/">Apache Spark Runner</a></li>
             <li><a href="{{ site.baseurl }}/documentation/runners/dataflow/">Cloud Dataflow Runner</a></li>
+
+            <li role="separator" class="divider"></li>
+            <li class="dropdown-header">DSLs</li>
+            <li><a href="{{ site.baseurl }}/documentation/dsls/sql/">SQL</a></li>
           </ul>
         </li>
         <li class="dropdown">
diff --git a/src/_posts/2017-07-21-sql-dsl.md b/src/_posts/2017-07-21-sql-dsl.md
index b21769c..f37c24a 100644
--- a/src/_posts/2017-07-21-sql-dsl.md
+++ b/src/_posts/2017-07-21-sql-dsl.md
@@ -8,60 +8,14 @@ authors:
   - mingmxu
 ---
 
-BeamSQL provides the capability to execute standard SQL queries using Beam Java SDK. DSL interface packages the backend parsing/validation/assembling features, and delivers a SDK style API to developers, to express a processing logic using SQL statements, from simple TABLE_FILTER, to complex queries containing JOIN/GROUP_BY etc.
+Beam SQL DSL provides the capability to execute standard SQL queries using the Beam Java SDK. It packages the backend parsing/validation/assembling features, and delivers an SDK style API to developers, to express a processing logic using SQL statements, from simple TABLE_FILTER, to complex queries containing JOIN/GROUP_BY etc.
 
 <!--more-->
 
-* [1. Overview](#overview)
-* [2. The Internal of Beam SQL](#internal-of-sql)
-* [3. Usage of DSL APIs](#usage)
-* [4. Functionality in Beam SQL](#functionality)
-  * [4.1. Supported Features](#features)
-  * [4.2. Intro of BeamSqlRow](#beamsqlrow)
-  * [4.3. Data Types](#data-type)
-  * [4.4. built-in SQL functions](#built-in-functions)
-
-This page describes the implementation of Beam SQL, and how to simplify a Beam pipeline with DSL APIs.
-
 # <a name="overview"></a>1. Overview
 SQL is a well-adopted standard to process data with concise syntax. With DSL APIs, now `PCollection`s can be queried with standard SQL statements, like a regular table. The DSL APIs leverage [Apache Calcite](http://calcite.apache.org/) to parse and optimize SQL queries, then translate into a composite Beam `PTransform`. In this way, both SQL and normal Beam `PTransform`s can be mixed in the same pipeline.
 
-# <a name="internal-of-sql"></a>2. The Internal of Beam SQL
-Figure 1 describes the back-end steps from a SQL statement to a Beam `PTransform`.
-
-![Workflow of Beam SQL DSL]({{ "/images/beam_sql_dsl_workflow.png" | prepend: site.baseurl }} "workflow of Beam SQL DSL")
-
-**Figure 1** workflow of Beam SQL DSL
-
-Given a `PCollection` and the query as input, first of all the input `PCollection` is registered as a table in the schema repository. Then it's processed as:
-
-1. SQL query is parsed according to grammar to generate a SQL Abstract Syntax Tree;
-2. Validate against table schema, and output a logical plan represented with relational algebras;
-3. Relational rules are applied to convert it to a physical plan, expressed with Beam components. An optimizer is optional to update the plan;
-4. Finally, the Beam physical plan is compiled as a composite `PTransform`;
-
-Here is an example to show a query that filters and projects from an input `PCollection`:
-
-```
-SELECT USER_ID, USER_NAME FROM PCOLLECTION WHERE USER_ID = 1
-```
-
-The logical plan is shown as:
-
-```
-LogicalProject(USER_ID=[$0], USER_NAME=[$1])
-  LogicalFilter(condition=[=($0, 1)])
-    LogicalTableScan(table=[[PCOLLECTION]])
-```
-
-And compiled as a composite `PTransform`
-
-```
-pCollection.apply(BeamSqlFilter...)
-           .apply(BeamSqlProject...)
-```
-
-# <a name="usage"></a>3. Usage of DSL APIs 
+# <a name="usage"></a>2. Usage of DSL APIs 
 The DSL interface (`BeamSql.query()` and `BeamSql.simpleQuery()`) is the only endpoint exposed to developers. It wraps the back-end details of parsing/validation/assembling, to deliver a Beam SDK style API that can take either simple TABLE_FILTER queries or complex queries containing JOIN/GROUP_BY etc.
 
 *Note*, the two APIs are equivalent in functionality, `BeamSql.query()` applies on a `PCollectionTuple` with one or many input `PCollection`s, and `BeamSql.simpleQuery()` is a simplified API which applies on a single `PCollection`.
@@ -96,288 +50,12 @@ Note that, SQL support is not fully completed. Queries that include unsupported
 ## <a name="features"></a>4.1. Supported Features
 The following features are supported in the current repository (this chapter will be updated continually).
 
-**1. filter clauses;**
-
-**2. data field projections;**
-
-**3. aggregations;**
-
-Beam SQL supports aggregation functions COUNT/SUM/MAX/MIN/AVG with group_by in global_window, fixed_window, sliding_window and session_window. A field with type `TIMESTAMP` is required to specify fixed_window/sliding_window/session window, which is used as event timestamp for rows. See below for several examples:
-
-```
-//fixed window, one hour in duration
-SELECT f_int, COUNT(*) AS `size` FROM PCOLLECTION GROUP BY f_int, TUMBLE(f_timestamp, INTERVAL '1' HOUR)
-
-//sliding window, one hour in duration and 30 minutes period
-SELECT f_int, COUNT(*) AS `size` FROM PCOLLECTION GROUP BY f_int, HOP(f_timestamp, INTERVAL '1' HOUR, INTERVAL '30' MINUTE)
-
-//session window, with 5 minutes gap duration
-SELECT f_int, COUNT(*) AS `size` FROM PCOLLECTION GROUP BY f_int, SESSION(f_timestamp, INTERVAL '5' MINUTE)
-```
-
-Note: 
-
-1. distinct aggregation is not supported yet.
-2. the default trigger is `Repeatedly.forever(AfterWatermark.pastEndOfWindow())`;
-3. when `time` field in `HOP(dateTime, slide, size [, time ])`/`TUMBLE(dateTime, interval [, time ])`/`SESSION(dateTime, interval [, time ])` is specified, a lateFiring trigger is added as 
-
-```
-Repeatedly.forever(AfterWatermark.pastEndOfWindow().withLateFirings(AfterProcessingTime
-        .pastFirstElementInPane().plusDelayOf(Duration.millis(delayTime.getTimeInMillis()))));
-```		
-
-**4. Join (inner, left_outer, right_outer);**
-
-The scenarios of join can be categorized into 3 cases:
-
-1. BoundedTable JOIN BoundedTable
-2. UnboundedTable JOIN UnboundedTable
-3. BoundedTable JOIN UnboundedTable
-
-For case 1 and case 2, a standard join is utilized as long as the windowFn of the both sides match. For case 3, sideInput is utilized to implement the join. So there are some constraints:
-
-* Only equal-join is supported, CROSS JOIN is not supported;
-* FULL OUTER JOIN is not supported;
-* If it's a LEFT OUTER JOIN, the unbounded table should on the left side; If it's a RIGHT OUTER JOIN, the unbounded table should on the right side;
-* The trigger is inherented from upstream, which should be consistent;
-
-**5. built-in SQL functions**
-
-**6. User Defined Function (UDF) and User Defined Aggregate Function (UDAF);**
-
-If the required function is not available, developers can register their own UDF(for scalar function) and UDAF(for aggregation function).
-
-**create and specify User Defined Function (UDF)**
-
-A UDF can be any Java method that takes zero or more scalar fields, and return one scalar value. Below is an example of UDF and how to use it in DSL:
-
-```
-/**
- * A example UDF for test.
- */
-public static class CubicInteger implements BeamSqlUdf{
-  public static Integer eval(Integer input){
-    return input * input * input;
-  }
-}
-
-// register and call in SQL
-String sql = "SELECT f_int, cubic(f_int) as cubicvalue FROM PCOLLECTION WHERE f_int = 2";
-PCollection<BeamSqlRow> result =
-    input.apply("udfExample",
-        BeamSql.simpleQuery(sql).withUdf("cubic", CubicInteger.class));
-```
-
-**create and specify User Defined Aggregate Function (UDAF)**
-
-A UDAF aggregates a set of grouped scalar values, and output a single scalar value. To create a UDAF function, it's required to extend `org.apache.beam.dsls.sql.schema.BeamSqlUdaf<InputT, AccumT, OutputT>`, which defines 4 methods to process an aggregation:
-
-1. init(), to create an initial accumulate value;
-2. add(), to apply a new value to existing accumulate value;
-3. merge(), to merge accumulate values from parallel operators;
-4. result(), to generate the final result from accumulate value;
-
-Here's an example of UDAF:
-
-```
-/**
- * UDAF for test, which returns the sum of square.
- */
-public static class SquareSum extends BeamSqlUdaf<Integer, Integer, Integer> {
-  public SquareSum() {
-  }
-  
-  @Override
-  public Integer init() {
-    return 0;
-  }
-  
-  @Override
-  public Integer add(Integer accumulator, Integer input) {
-    return accumulator + input * input;
-  }
-  
-  @Override
-  public Integer merge(Iterable<Integer> accumulators) {
-    int v = 0;
-    Iterator<Integer> ite = accumulators.iterator();
-    while (ite.hasNext()) {
-      v += ite.next();
-    }
-    return v;
-  }
-
-  @Override
-  public Integer result(Integer accumulator) {
-    return accumulator;
-  }
-}
-
-//register and call in SQL
-String sql = "SELECT f_int1, squaresum(f_int2) AS `squaresum` FROM PCOLLECTION GROUP BY f_int2";
-PCollection<BeamSqlRow> result =
-    input.apply("udafExample",
-        BeamSql.simpleQuery(sql).withUdaf("squaresum", SquareSum.class));
-```
- 
-## <a name="beamsqlrow"></a>4.2. Intro of BeamSqlRow
-`BeamSqlRow`, encoded/decoded by `BeamSqlRowCoder`, represents a single, implicitly structured data item in a Beam SQL compatible `PCollection`. Similar as _row_ in the context of relational database, each `BeamSqlRow` consists of named columns with fixed types(see [4.3. Data Types](#data-type)).
-
-A Beam SQL `PCollection` can be created from an external source, in-memory data or derive from another SQL query. For `PCollection`s from external source or in-memory data, it's required to specify coder explcitly; `PCollection` derived from SQL query has the coder set already. Below is one example:
-
-```
-//define the input row format
-List<String> fieldNames = Arrays.asList("c1", "c2", "c3");
-List<Integer> fieldTypes = Arrays.asList(Types.INTEGER, Types.VARCHAR, Types.DOUBLE);
-BeamSqlRecordType type = BeamSqlRecordType.create(fieldNames, fieldTypes);
-BeamSqlRow row = new BeamSqlRow(type);
-row.addField(0, 1);
-row.addField(1, "row");
-row.addField(2, 1.0);
-
-//create a source PCollection with Create.of();
-PCollection<BeamSqlRow> inputTable = PBegin.in(p).apply(Create.of(row)
-    .withCoder(new BeamSqlRowCoder(type)));
-```
- 
-## <a name="data-type"></a>4.3. Data Types
-Each type in Beam SQL maps to a Java class to holds the value in `BeamSqlRow`. The following table lists the relation between SQL types and Java classes, which are supported in current repository:
-
-| SQL Type | Java class |
-| ---- | ---- |
-| Types.INTEGER | java.lang.Integer |
-| Types.SMALLINT | java.lang.Short |
-| Types.TINYINT | java.lang.Byte |
-| Types.BIGINT | java.lang.Long |
-| Types.FLOAT | java.lang.Float |
-| Types.DOUBLE | java.lang.Double |
-| Types.DECIMAL | java.math.BigDecimal |
-| Types.VARCHAR | java.lang.String |
-| Types.TIMESTAMP | java.util.Date |
-{:.table}
-
-## <a name="built-in-functions"></a>4.4. built-in SQL functions
-
-Beam SQL has implemented lots of build-in functions defined in [Apache Calcite](http://calcite.apache.org). The available functions are listed as below:
-
-**Comparison functions and operators**
-
-| Operator syntax | Description |
-| ---- | ---- |
-| value1 = value2 | Equals |
-| value1 <> value2 | Not equal |
-| value1 > value2 | Greater than |
-| value1 >= value2 | Greater than or equal |
-| value1 < value2 | Less than |
-| value1 <= value2 | Less than or equal |
-| value IS NULL | Whether value is null |
-| value IS NOT NULL | Whether value is not null |
-{:.table}
-
-**Logical functions and operators**
-
-| Operator syntax | Description |
-| ---- | ---- |
-| boolean1 OR boolean2 | Whether boolean1 is TRUE or boolean2 is TRUE |
-| boolean1 AND boolean2 | Whether boolean1 and boolean2 are both TRUE |
-| NOT boolean | Whether boolean is not TRUE; returns UNKNOWN if boolean is UNKNOWN |
-{:.table}
-
-**Arithmetic functions and operators**
-
-| Operator syntax | Description|
-| ---- | ---- |
-| numeric1 + numeric2 | Returns numeric1 plus numeric2| 
-| numeric1 - numeric2 | Returns numeric1 minus numeric2| 
-| numeric1 * numeric2 | Returns numeric1 multiplied by numeric2| 
-| numeric1 / numeric2 | Returns numeric1 divided by numeric2| 
-| MOD(numeric, numeric) | Returns the remainder (modulus) of numeric1 divided by numeric2. The result is negative only if numeric1 is negative| 
-{:.table}
-
-**Math functions**
-
-| Operator syntax | Description |
-| ---- | ---- |
-| ABS(numeric) | Returns the absolute value of numeric |
-| SQRT(numeric) | Returns the square root of numeric |
-| LN(numeric) | Returns the natural logarithm (base e) of numeric |
-| LOG10(numeric) | Returns the base 10 logarithm of numeric |
-| EXP(numeric) | Returns e raised to the power of numeric |
-| ACOS(numeric) | Returns the arc cosine of numeric |
-| ASIN(numeric) | Returns the arc sine of numeric |
-| ATAN(numeric) | Returns the arc tangent of numeric |
-| COT(numeric) | Returns the cotangent of numeric |
-| DEGREES(numeric) | Converts numeric from radians to degrees |
-| RADIANS(numeric) | Converts numeric from degrees to radians |
-| SIGN(numeric) | Returns the signum of numeric |
-| SIN(numeric) | Returns the sine of numeric |
-| TAN(numeric) | Returns the tangent of numeric |
-| ROUND(numeric1, numeric2) | Rounds numeric1 to numeric2 places right to the decimal point |
-{:.table}
-
-**Date functions**
-
-| Operator syntax | Description |
-| ---- | ---- |
-| LOCALTIME | Returns the current date and time in the session time zone in a value of datatype TIME |
-| LOCALTIME(precision) | Returns the current date and time in the session time zone in a value of datatype TIME, with precision digits of precision |
-| LOCALTIMESTAMP | Returns the current date and time in the session time zone in a value of datatype TIMESTAMP |
-| LOCALTIMESTAMP(precision) | Returns the current date and time in the session time zone in a value of datatype TIMESTAMP, with precision digits of precision |
-| CURRENT_TIME | Returns the current time in the session time zone, in a value of datatype TIMESTAMP WITH TIME ZONE |
-| CURRENT_DATE | Returns the current date in the session time zone, in a value of datatype DATE |
-| CURRENT_TIMESTAMP | Returns the current date and time in the session time zone, in a value of datatype TIMESTAMP WITH TIME ZONE |
-| EXTRACT(timeUnit FROM datetime) | Extracts and returns the value of a specified datetime field from a datetime value expression |
-| FLOOR(datetime TO timeUnit) | Rounds datetime down to timeUnit |
-| CEIL(datetime TO timeUnit) | Rounds datetime up to timeUnit |
-| YEAR(date) | Equivalent to EXTRACT(YEAR FROM date). Returns an integer. |
-| QUARTER(date) | Equivalent to EXTRACT(QUARTER FROM date). Returns an integer between 1 and 4. |
-| MONTH(date) | Equivalent to EXTRACT(MONTH FROM date). Returns an integer between 1 and 12. |
-| WEEK(date) | Equivalent to EXTRACT(WEEK FROM date). Returns an integer between 1 and 53. |
-| DAYOFYEAR(date) | Equivalent to EXTRACT(DOY FROM date). Returns an integer between 1 and 366. |
-| DAYOFMONTH(date) | Equivalent to EXTRACT(DAY FROM date). Returns an integer between 1 and 31. |
-| DAYOFWEEK(date) | Equivalent to EXTRACT(DOW FROM date). Returns an integer between 1 and 7. |
-| HOUR(date) | Equivalent to EXTRACT(HOUR FROM date). Returns an integer between 0 and 23. |
-| MINUTE(date) | Equivalent to EXTRACT(MINUTE FROM date). Returns an integer between 0 and 59. |
-| SECOND(date) | Equivalent to EXTRACT(SECOND FROM date). Returns an integer between 0 and 59. |
-{:.table}
-
-**String functions**
-
-| Operator syntax | Description |
-| ---- | ---- |
-| string \|\| string | Concatenates two character strings |
-| CHAR_LENGTH(string) | Returns the number of characters in a character string |
-| CHARACTER_LENGTH(string) | As CHAR_LENGTH(string) |
-| UPPER(string) | Returns a character string converted to upper case |
-| LOWER(string) | Returns a character string converted to lower case |
-| POSITION(string1 IN string2) | Returns the position of the first occurrence of string1 in string2 |
-| POSITION(string1 IN string2 FROM integer) | Returns the position of the first occurrence of string1 in string2 starting at a given point (not standard SQL) |
-| TRIM( { BOTH \| LEADING \| TRAILING } string1 FROM string2) | Removes the longest string containing only the characters in string1 from the start/end/both ends of string1 |
-| OVERLAY(string1 PLACING string2 FROM integer [ FOR integer2 ]) | Replaces a substring of string1 with string2 |
-| SUBSTRING(string FROM integer) | Returns a substring of a character string starting at a given point |
-| SUBSTRING(string FROM integer FOR integer) | Returns a substring of a character string starting at a given point with a given length |
-| INITCAP(string) | Returns string with the first letter of each word converter to upper case and the rest to lower case. Words are sequences of alphanumeric characters separated by non-alphanumeric characters. |
-{:.table}
-
-**Conditional functions**
-
-| Operator syntax | Description |
-| ---- | ---- |
-| CASE value <br>WHEN value1 [, value11 ]* THEN result1 <br>[ WHEN valueN [, valueN1 ]* THEN resultN ]* <br>[ ELSE resultZ ] <br>END | Simple case |
-| CASE <br>WHEN condition1 THEN result1 <br>[ WHEN conditionN THEN resultN ]* <br>[ ELSE resultZ ] <br>END | Searched case |
-| NULLIF(value, value) | Returns NULL if the values are the same. For example, NULLIF(5, 5) returns NULL; NULLIF(5, 0) returns 5. |
-| COALESCE(value, value [, value ]*) | Provides a value if the first value is null. For example, COALESCE(NULL, 5) returns 5. |
-{:.table}
-
-**Type conversion functions**
+1. filter clauses;
+2. data field projections;
+3. aggregations (global_window, fixed_window, sliding_window, session_window);
+4. Join (inner, left_outer, right_outer);
+5. built-in SQL functions;
+6. User Defined Function (UDF) and User Defined Aggregate Function (UDAF);
 
-**Aggregate functions**
+For a deeper dive, please visit [DSLs: SQL]({{ site.baseurl }}/documentation/dsls/sql/).
 
-| Operator syntax | Description |
-| ---- | ---- |
-| COUNT(*) | Returns the number of input rows |
-| AVG(numeric) | Returns the average (arithmetic mean) of numeric across all input values |
-| SUM(numeric) | Returns the sum of numeric across all input values |
-| MAX(value) | Returns the maximum value of value across all input values |
-| MIN(value) | Returns the minimum value of value across all input values |
-{:.table}
diff --git a/src/_posts/2017-07-21-sql-dsl.md b/src/documentation/dsls/sql.md
similarity index 97%
copy from src/_posts/2017-07-21-sql-dsl.md
copy to src/documentation/dsls/sql.md
index b21769c..76574e5 100644
--- a/src/_posts/2017-07-21-sql-dsl.md
+++ b/src/documentation/dsls/sql.md
@@ -1,17 +1,9 @@
 ---
-layout: post
-title:  "Use Beam SQL DSL to build a pipeline"
-date:   2017-07-21 00:00:00 -0800
-excerpt_separator: <!--more-->
-categories: blog
-authors:
-  - mingmxu
+layout: default
+title: "DSLs: SQL"
+permalink: /documentation/dsls/sql/
 ---
 
-BeamSQL provides the capability to execute standard SQL queries using Beam Java SDK. DSL interface packages the backend parsing/validation/assembling features, and delivers a SDK style API to developers, to express a processing logic using SQL statements, from simple TABLE_FILTER, to complex queries containing JOIN/GROUP_BY etc.
-
-<!--more-->
-
 * [1. Overview](#overview)
 * [2. The Internals of Beam SQL](#internal-of-sql)
 * [3. Usage of DSL APIs](#usage)

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" <co...@beam.apache.org>.

[beam-site] 09/10: remove post page move internal imple to the end of page

Posted by me...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 292e22c1fc683128810a00ed937fdd2f73427234
Author: mingmxu <mi...@ebay.com>
AuthorDate: Mon Aug 21 09:16:36 2017 -0700

    remove post page
    move internal imple to the end of page
---
 src/_data/authors.yml            |  4 --
 src/_posts/2017-07-21-sql-dsl.md | 64 --------------------------
 src/documentation/dsls/sql.md    | 98 +++++++++++++++++++++-------------------
 3 files changed, 51 insertions(+), 115 deletions(-)

diff --git a/src/_data/authors.yml b/src/_data/authors.yml
index 1b76ddb..d8cd836 100644
--- a/src/_data/authors.yml
+++ b/src/_data/authors.yml
@@ -28,10 +28,6 @@ klk:
     name: Kenneth Knowles
     email: klk@apache.org
     twitter: KennKnowles
-mingmxu:
-    name: Mingmin Xu
-    email: mingmxus@gmail.com
-    twitter:
 robertwb:
     name: Robert Bradshaw
     email: robertwb@apache.org
diff --git a/src/_posts/2017-07-21-sql-dsl.md b/src/_posts/2017-07-21-sql-dsl.md
deleted file mode 100644
index 95df3f8..0000000
--- a/src/_posts/2017-07-21-sql-dsl.md
+++ /dev/null
@@ -1,64 +0,0 @@
----
-layout: post
-title:  "Use Beam SQL DSL to build a pipeline"
-date:   2017-07-21 00:00:00 -0800
-excerpt_separator: <!--more-->
-categories: blog
-authors:
-  - mingmxu
----
-
-Beam SQL DSL provides the capability to execute standard SQL queries using the Beam Java SDK. It packages the back-end parsing/validation/assembling features, and delivers an SDK-style API for developers to express processing logic using SQL statements, from a simple TABLE_FILTER to complex queries containing JOIN/GROUP_BY etc.
-
-<!--more-->
-
-# <a name="overview"></a>1. Overview
-SQL is a well-adopted standard to process data with concise syntax. With the DSL APIs, `PCollection`s can now be queried with standard SQL statements, like a regular table. The DSL APIs leverage [Apache Calcite](http://calcite.apache.org/) to parse and optimize SQL queries, then translate them into a _composite_ Beam `PTransform`. In this way, both SQL and normal Beam `PTransform`s can be mixed in one pipeline.
-
-# <a name="usage"></a>2. Usage of DSL APIs 
-`BeamSql` is the only interface (with two methods, `BeamSql.query()` and `BeamSql.simpleQuery()`) exposed to developers. It wraps the back-end details of parsing/validation/assembling, and delivers a Beam SDK-style API that can take either simple TABLE_FILTER queries or complex queries containing JOIN/GROUP_BY etc. 
-
-*Note*: the two methods are equivalent in functionality; `BeamSql.query()` applies to a `PCollectionTuple` with one or many input `PCollection`s, while `BeamSql.simpleQuery()` is a simplified API that applies to a single `PCollection`.
-
-[BeamSqlExample](https://github.com/apache/beam/blob/DSL_SQL/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/example/BeamSqlExample.java) in the code repository shows the usage of both APIs:
-
-```
-//Step 1. create a source PCollection with Create.of();
-BeamRecordSqlType type = BeamRecordSqlType.create(fieldNames, fieldTypes);
-...
-
-PCollection<BeamRecord> inputTable = PBegin.in(p).apply(Create.of(row...)
-    .withCoder(type.getRecordCoder()));
-
-//Step 2. (Case 1) run a simple SQL query over input PCollection with BeamSql.simpleQuery;
-PCollection<BeamRecord> outputStream = inputTable.apply(
-    BeamSql.simpleQuery("select c1, c2, c3 from PCOLLECTION where c1=1"));
-
-
-//Step 2. (Case 2) run the query with BeamSql.query over result PCollection of (case 1);
-PCollection<BeamRecord> outputStream2 =
-    PCollectionTuple.of(new TupleTag<BeamRecord>("CASE1_RESULT"), outputStream)
-        .apply(BeamSql.query("select c2, sum(c3) from CASE1_RESULT group by c2"));
-```
-
-In Step 1, a `PCollection<BeamRecord>` is prepared as the source dataset. The work to generate a queryable `PCollection<BeamRecord>` is beyond the scope of Beam SQL DSL. 
-
-Step 2 (Case 1) shows how to run a query with `BeamSql.simpleQuery()`; be aware that the input `PCollection` is named with the fixed table name __PCOLLECTION__. Step 2 (Case 2) is another example, running a query with `BeamSql.query()`; a table name is specified when adding the `PCollection` to the `PCollectionTuple`. As each call of either `BeamSql.query()` or `BeamSql.simpleQuery()` has its own schema repository, developers need to include all `PCollection`s that will be used in their query.
-
-# <a name="functionality"></a>4. Functionality in Beam SQL
-In line with Beam's unified model for bounded and unbounded data, the SQL DSL provides the same functionality for both bounded and unbounded `PCollection`s. 
-
-Note that SQL support is not yet complete; queries that include unsupported features will cause an `UnsupportedOperationException`.
-
-## <a name="features"></a>4.1. Supported Features
-The following features are supported in the current repository:
-
-1. filter clauses;
-2. data field projections;
-3. aggregations (global_window, fixed_window, sliding_window, session_window);
-4. Join (inner, left_outer, right_outer);
-5. built-in SQL functions;
-6. User Defined Function (UDF) and User Defined Aggregate Function (UDAF);
-
-For a deeper dive, please visit [DSLs: SQL]({{ site.baseurl }}/documentation/dsls/sql/).
-
diff --git a/src/documentation/dsls/sql.md b/src/documentation/dsls/sql.md
index bfcbd1c..4f25fe4 100644
--- a/src/documentation/dsls/sql.md
+++ b/src/documentation/dsls/sql.md
@@ -5,55 +5,23 @@ permalink: /documentation/dsls/sql/
 ---
 
 * [1. Overview](#overview)
-* [2. The Internals of Beam SQL](#internal-of-sql)
-* [3. Usage of DSL APIs](#usage)
-* [4. Functionality in Beam SQL](#functionality)
-  * [4.1. Supported Features](#features)
-  * [4.2. Intro of BeamSqlRow](#beamsqlrow)
-  * [4.3. Data Types](#data-type)
-  * [4.4. built-in SQL functions](#built-in-functions)
+* [2. Usage of DSL APIs](#usage)
+* [3. Functionality in Beam SQL](#functionality)
+  * [3.1. Supported Features](#features)
+  * [3.2. Intro of BeamRecord](#beamsqlrow)
+  * [3.3. Data Types](#data-type)
+  * [3.4. built-in SQL functions](#built-in-functions)
+* [4. The Internals of Beam SQL](#internal-of-sql)
 
 This page describes the implementation of Beam SQL and how to simplify a Beam pipeline with the DSL APIs.
 
+> Note: Beam SQL hasn't been merged to the master branch yet (it is being developed on the [DSL_SQL](https://github.com/apache/beam/tree/DSL_SQL) branch), but is coming soon.
+
 # <a name="overview"></a>1. Overview
 SQL is a well-adopted standard to process data with concise syntax. With the DSL APIs, `PCollection`s can now be queried with standard SQL statements, like a regular table. The DSL APIs leverage [Apache Calcite](http://calcite.apache.org/) to parse and optimize SQL queries, then translate them into a composite Beam `PTransform`. In this way, both SQL and normal Beam `PTransform`s can be mixed in the same pipeline.
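+
+For instance, a SQL step can sit between ordinary transforms. A minimal sketch (assuming `records` is an existing `PCollection<BeamRecord>` and `MyDoFn` is a made-up downstream `DoFn`):
+
+```
+PCollection<BeamRecord> filtered = records.apply(
+    BeamSql.simpleQuery("SELECT c1, c2 FROM PCOLLECTION WHERE c1 = 1"));
+
+//continue with a plain Beam transform
+filtered.apply(ParDo.of(new MyDoFn()));
+```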
 
-# <a name="internal-of-sql"></a>2. The Internals of Beam SQL
-Figure 1 describes the back-end steps from a SQL statement to a Beam `PTransform`.
-
-![Workflow of Beam SQL DSL]({{ "/images/beam_sql_dsl_workflow.png" | prepend: site.baseurl }} "workflow of Beam SQL DSL")
-
-**Figure 1** workflow of Beam SQL DSL
-
-Given a `PCollection` and a query as input, the input `PCollection` is first registered as a table in the schema repository. The query is then processed as follows:
-
-1. The SQL query is parsed according to the grammar to generate a SQL Abstract Syntax Tree;
-2. The tree is validated against the table schema, and a logical plan is output, represented with relational algebra;
-3. Relational rules are applied to convert it to a physical plan, expressed with Beam components. An optional optimizer may update the plan;
-4. Finally, the Beam physical plan is compiled as a composite `PTransform`.
 
-Here is an example query that filters and projects from an input `PCollection`:
-
-```
-SELECT USER_ID, USER_NAME FROM PCOLLECTION WHERE USER_ID = 1
-```
-
-The logical plan is shown as:
-
-```
-LogicalProject(USER_ID=[$0], USER_NAME=[$1])
-  LogicalFilter(condition=[=($0, 1)])
-    LogicalTableScan(table=[[PCOLLECTION]])
-```
-
-And it is compiled into a composite `PTransform`:
-
-```
-pCollection.apply(BeamSqlFilter...)
-           .apply(BeamSqlProject...)
-```
-
-# <a name="usage"></a>3. Usage of DSL APIs 
+# <a name="usage"></a>2. Usage of DSL APIs 
 `BeamSql` is the only interface (with two methods, `BeamSql.query()` and `BeamSql.simpleQuery()`) exposed to developers. It wraps the back-end details of parsing/validation/assembling, and delivers a Beam SDK-style API that can take either simple TABLE_FILTER queries or complex queries containing JOIN/GROUP_BY etc. 
 
 *Note*: the two methods are equivalent in functionality; `BeamSql.query()` applies to a `PCollectionTuple` with one or many input `PCollection`s, while `BeamSql.simpleQuery()` is a simplified API that applies to a single `PCollection`.
@@ -83,12 +51,12 @@ In Step 1, a `PCollection<BeamRecord>` is prepared as the source dataset. The wo
 
 Step 2 (Case 1) shows how to run a query with `BeamSql.simpleQuery()`; be aware that the input `PCollection` is named with the fixed table name __PCOLLECTION__. Step 2 (Case 2) is another example, running a query with `BeamSql.query()`; a table name is specified when adding the `PCollection` to the `PCollectionTuple`. As each call of either `BeamSql.query()` or `BeamSql.simpleQuery()` has its own schema repository, developers need to include all `PCollection`s that will be used in their query.
 
-# <a name="functionality"></a>4. Functionality in Beam SQL
+# <a name="functionality"></a>3. Functionality in Beam SQL
 In line with Beam's unified model for bounded and unbounded data, the SQL DSL provides the same functionality for both bounded and unbounded `PCollection`s. 
 
 Note that SQL support is not yet complete; queries that include unsupported features will cause an `UnsupportedOperationException`.
 
-## <a name="features"></a>4.1. Supported Features
+## <a name="features"></a>3.1. Supported Features
 The following features are supported in the current repository:
 
 **1. filter clauses;**
@@ -217,7 +185,7 @@ PCollection<BeamRecord> result =
         BeamSql.simpleQuery(sql).withUdaf("squaresum", new SquareSum()));
 ```
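+
+Because a UDAF is a standard `CombineFn`, the same class can also be reused with the core `Combine` transform outside SQL. A sketch (assuming `numbers` is an existing `PCollection<Integer>`):
+
+```
+//reuse the SQL UDAF as a plain CombineFn
+PCollection<Integer> squareSum = numbers.apply(Combine.globally(new SquareSum()));
+```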
  
-## <a name="beamsqlrow"></a>4.2. Intro of BeamRecord
+## <a name="beamsqlrow"></a>3.2. Intro of BeamRecord
 `BeamRecord`, described by `BeamRecordType` (extended as `BeamRecordSqlType` in Beam SQL) and encoded/decoded by `BeamRecordCoder`, represents a single, immutable row in a Beam SQL `PCollection`. Similar to a _row_ in a relational database, each `BeamRecord` consists of named columns with fixed types (see [4.3. Data Types](#data-type)).
 
 A Beam SQL `PCollection` can be created from an external source, from in-memory data, or derived from another SQL query. For `PCollection`s from an external source or in-memory data, it's required to specify the coder explicitly; a `PCollection` derived from a SQL query has the coder set already. Below is one example:
@@ -234,7 +202,7 @@ PCollection<BeamRecord> inputTable = PBegin.in(p).apply(Create.of(row)
     .withCoder(type.getRecordCoder()));
 ```
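+
+Field values can be read back from a `BeamRecord` by index; a sketch, assuming the typed getters on `BeamRecord`:
+
+```
+Integer c1 = row.getInteger(0); //1
+String c2 = row.getString(1);   //"row"
+Double c3 = row.getDouble(2);   //1.0
+```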
  
-## <a name="data-type"></a>4.3. Data Types
+## <a name="data-type"></a>3.3. Data Types
 Each type in Beam SQL maps to a Java class that holds the value in `BeamRecord`. The following table lists the mapping between SQL types and Java classes supported in the current repository:
 
 | SQL Type | Java class |
@@ -250,7 +218,7 @@ Each type in Beam SQL maps to a Java class to holds the value in `BeamRecord`. T
 | Types.TIMESTAMP | java.util.Date |
 {:.table}
 
-## <a name="built-in-functions"></a>4.4. built-in SQL functions
+## <a name="built-in-functions"></a>3.4. built-in SQL functions
 
 Beam SQL implements many of the built-in functions defined in [Apache Calcite](http://calcite.apache.org). The available functions are listed below:
 
@@ -375,3 +343,39 @@ Beam SQL has implemented lots of build-in functions defined in [Apache Calcite](
 | MAX(value) | Returns the maximum value of value across all input values |
 | MIN(value) | Returns the minimum value of value across all input values |
 {:.table}
+
+# <a name="internal-of-sql"></a>4. The Internals of Beam SQL
+Figure 1 describes the back-end steps from a SQL statement to a Beam `PTransform`.
+
+![Workflow of Beam SQL DSL]({{ "/images/beam_sql_dsl_workflow.png" | prepend: site.baseurl }} "workflow of Beam SQL DSL")
+
+**Figure 1** workflow of Beam SQL DSL
+
+Given a `PCollection` and a query as input, the input `PCollection` is first registered as a table in the schema repository. The query is then processed as follows:
+
+1. The SQL query is parsed according to the grammar to generate a SQL Abstract Syntax Tree;
+2. The tree is validated against the table schema, and a logical plan is output, represented with relational algebra;
+3. Relational rules are applied to convert it to a physical plan, expressed with Beam components. An optional optimizer may update the plan;
+4. Finally, the Beam physical plan is compiled as a composite `PTransform`.
+
+Here is an example query that filters and projects from an input `PCollection`:
+
+```
+SELECT USER_ID, USER_NAME FROM PCOLLECTION WHERE USER_ID = 1
+```
+
+The logical plan is shown as:
+
+```
+LogicalProject(USER_ID=[$0], USER_NAME=[$1])
+  LogicalFilter(condition=[=($0, 1)])
+    LogicalTableScan(table=[[PCOLLECTION]])
+```
+
+And it is compiled into a composite `PTransform`:
+
+```
+pCollection.apply(BeamSqlFilter...)
+           .apply(BeamSqlProject...)
+```
+

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" <co...@beam.apache.org>.

[beam-site] 08/10: update to latest API/usages

Posted by me...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 2a52cb3af842addcb08e554d21671f9feb1d58a4
Author: mingmxu <mi...@ebay.com>
AuthorDate: Thu Aug 17 17:16:05 2017 -0700

    update to latest API/usages
---
 src/_posts/2017-07-21-sql-dsl.md |  33 +++++++------
 src/documentation/dsls/sql.md    | 102 ++++++++++++++++++++-------------------
 2 files changed, 70 insertions(+), 65 deletions(-)

diff --git a/src/_posts/2017-07-21-sql-dsl.md b/src/_posts/2017-07-21-sql-dsl.md
index 4aa97d1..95df3f8 100644
--- a/src/_posts/2017-07-21-sql-dsl.md
+++ b/src/_posts/2017-07-21-sql-dsl.md
@@ -13,42 +13,45 @@ Beam SQL DSL provides the capability to execute standard SQL queries using Beam
 <!--more-->
 
 # <a name="overview"></a>1. Overview
-SQL is a well-adopted standard to process data with concise syntax. With DSL APIs, now `PCollection`s can be queried with standard SQL statements, like a regular table. The DSL APIs leverage [Apache Calcite](http://calcite.apache.org/) to parse and optimize SQL queries, then translate into a composite Beam `PTransform`. In this way, both SQL and normal Beam `PTransform`s can be mixed in the same pipeline.
+SQL is a well-adopted standard to process data with concise syntax. With the DSL APIs, `PCollection`s can now be queried with standard SQL statements, like a regular table. The DSL APIs leverage [Apache Calcite](http://calcite.apache.org/) to parse and optimize SQL queries, then translate them into a _composite_ Beam `PTransform`. In this way, both SQL and normal Beam `PTransform`s can be mixed in one pipeline.
 
 # <a name="usage"></a>2. Usage of DSL APIs 
-The DSL interface (`BeamSql.query()` and `BeamSql.simpleQuery()`), is the only endpoint exposed to developers. It wraps the back-end details of parsing/validation/assembling, to deliver a Beam SDK style API that can take either simple TABLE_FILTER queries or complex queries containing JOIN/GROUP_BY etc. 
+`BeamSql` is the only interface (with two methods, `BeamSql.query()` and `BeamSql.simpleQuery()`) exposed to developers. It wraps the back-end details of parsing/validation/assembling, and delivers a Beam SDK-style API that can take either simple TABLE_FILTER queries or complex queries containing JOIN/GROUP_BY etc. 
 
-*Note*, the two APIs are equivalent in functionality, `BeamSql.query()` applies on a `PCollectionTuple` with one or many input `PCollection`s, and `BeamSql.simpleQuery()` is a simplified API which applies on single `PCollection`.
+*Note*: the two methods are equivalent in functionality; `BeamSql.query()` applies to a `PCollectionTuple` with one or many input `PCollection`s, while `BeamSql.simpleQuery()` is a simplified API that applies to a single `PCollection`.
 
 [BeamSqlExample](https://github.com/apache/beam/blob/DSL_SQL/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/example/BeamSqlExample.java) in the code repository shows the usage of both APIs:
 
 ```
 //Step 1. create a source PCollection with Create.of();
-PCollection<BeamSqlRow> inputTable = PBegin.in(p).apply(Create.of(row)
-    .withCoder(new BeamSqlRowCoder(type)));
+BeamRecordSqlType type = BeamRecordSqlType.create(fieldNames, fieldTypes);
+...
+
+PCollection<BeamRecord> inputTable = PBegin.in(p).apply(Create.of(row...)
+    .withCoder(type.getRecordCoder()));
 
 //Step 2. (Case 1) run a simple SQL query over input PCollection with BeamSql.simpleQuery;
-PCollection<BeamSqlRow> outputStream = inputTable.apply(
-    BeamSql.simpleQuery("select c2, c3 from PCOLLECTION where c1=1"));
+PCollection<BeamRecord> outputStream = inputTable.apply(
+    BeamSql.simpleQuery("select c1, c2, c3 from PCOLLECTION where c1=1"));
 
 
-//Step 2. (Case 2) run the query with BeamSql.query
-PCollection<BeamSqlRow> outputStream2 =
-    PCollectionTuple.of(new TupleTag<BeamSqlRow>("TABLE_B"), inputTable)
-        .apply(BeamSql.query("select c2, c3 from TABLE_B where c1=1"));
+//Step 2. (Case 2) run the query with BeamSql.query over result PCollection of (case 1);
+PCollection<BeamRecord> outputStream2 =
+    PCollectionTuple.of(new TupleTag<BeamRecord>("CASE1_RESULT"), outputStream)
+        .apply(BeamSql.query("select c2, sum(c3) from CASE1_RESULT group by c2"));
 ```
 
-In Step 1, a `PCollection<BeamSqlRow>` is prepared as the source dataset. The work to generate a queriable `PCollection<BeamSqlRow>` is beyond the scope of Beam SQL DSL. 
+In Step 1, a `PCollection<BeamRecord>` is prepared as the source dataset. The work to generate a queryable `PCollection<BeamRecord>` is beyond the scope of Beam SQL DSL. 
 
-Step 2(Case 1) shows the way to run a query with `BeamSql.simpleQuery()`, be aware that the input `PCollection` is named with a fixed table name __PCOLLECTION__. Step 2(Case 2) is an example to run a query with `BeamSql.query()`. A Table name used in the query is specified when adding `PCollection` to `PCollectionTuple`. As each call of either `BeamSql.query()` or `BeamSql.simpleQuery()` has its own schema repository, developers need to include all `PCollection`s that would be used in yo [...]
+Step 2 (Case 1) shows how to run a query with `BeamSql.simpleQuery()`; be aware that the input `PCollection` is named with the fixed table name __PCOLLECTION__. Step 2 (Case 2) is another example, running a query with `BeamSql.query()`; a table name is specified when adding the `PCollection` to the `PCollectionTuple`. As each call of either `BeamSql.query()` or `BeamSql.simpleQuery()` has its own schema repository, developers need to include all `PCollection`s that will be used in their query.
 
 # <a name="functionality"></a>4. Functionality in Beam SQL
 In line with Beam's unified model for bounded and unbounded data, the SQL DSL provides the same functionality for both bounded and unbounded `PCollection`s. 
 
-Note that, SQL support is not fully completed. Queries that include unsupported features would cause a UnsupportedOperationException.
+Note that SQL support is not yet complete; queries that include unsupported features will cause an `UnsupportedOperationException`.
 
 ## <a name="features"></a>4.1. Supported Features
-The following features are supported in current repository (this chapter will be updated continually).
+The following features are supported in the current repository:
 
 1. filter clauses;
 2. data field projections;
diff --git a/src/documentation/dsls/sql.md b/src/documentation/dsls/sql.md
index 34f26ae..bfcbd1c 100644
--- a/src/documentation/dsls/sql.md
+++ b/src/documentation/dsls/sql.md
@@ -54,39 +54,42 @@ pCollection.apply(BeamSqlFilter...)
 ```
 
 # <a name="usage"></a>3. Usage of DSL APIs 
-The DSL interface (`BeamSql.query()` and `BeamSql.simpleQuery()`), is the only endpoint exposed to developers. It wraps the back-end details of parsing/validation/assembling, to deliver a Beam SDK style API that can take either simple TABLE_FILTER queries or complex queries containing JOIN/GROUP_BY etc. 
+`BeamSql` is the only interface (with two methods, `BeamSql.query()` and `BeamSql.simpleQuery()`) exposed to developers. It wraps the back-end details of parsing/validation/assembling, and delivers a Beam SDK-style API that can take either simple TABLE_FILTER queries or complex queries containing JOIN/GROUP_BY etc. 
 
-*Note*, the two APIs are equivalent in functionality, `BeamSql.query()` applies on a `PCollectionTuple` with one or many input `PCollection`s, and `BeamSql.simpleQuery()` is a simplified API which applies on single `PCollection`.
+*Note*: the two methods are equivalent in functionality; `BeamSql.query()` applies to a `PCollectionTuple` with one or many input `PCollection`s, while `BeamSql.simpleQuery()` is a simplified API that applies to a single `PCollection`.
 
 [BeamSqlExample](https://github.com/apache/beam/blob/DSL_SQL/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/example/BeamSqlExample.java) in the code repository shows the usage of both APIs:
 
 ```
 //Step 1. create a source PCollection with Create.of();
-PCollection<BeamSqlRow> inputTable = PBegin.in(p).apply(Create.of(row)
-    .withCoder(new BeamSqlRowCoder(type)));
+BeamRecordSqlType type = BeamRecordSqlType.create(fieldNames, fieldTypes);
+...
+
+PCollection<BeamRecord> inputTable = PBegin.in(p).apply(Create.of(row...)
+    .withCoder(type.getRecordCoder()));
 
 //Step 2. (Case 1) run a simple SQL query over input PCollection with BeamSql.simpleQuery;
-PCollection<BeamSqlRow> outputStream = inputTable.apply(
-    BeamSql.simpleQuery("select c2, c3 from PCOLLECTION where c1=1"));
+PCollection<BeamRecord> outputStream = inputTable.apply(
+    BeamSql.simpleQuery("select c1, c2, c3 from PCOLLECTION where c1=1"));
 
 
-//Step 2. (Case 2) run the query with BeamSql.query
-PCollection<BeamSqlRow> outputStream2 =
-    PCollectionTuple.of(new TupleTag<BeamSqlRow>("TABLE_B"), inputTable)
-        .apply(BeamSql.query("select c2, c3 from TABLE_B where c1=1"));
+//Step 2. (Case 2) run the query with BeamSql.query over result PCollection of (case 1);
+PCollection<BeamRecord> outputStream2 =
+    PCollectionTuple.of(new TupleTag<BeamRecord>("CASE1_RESULT"), outputStream)
+        .apply(BeamSql.query("select c2, sum(c3) from CASE1_RESULT group by c2"));
 ```
 
-In Step 1, a `PCollection<BeamSqlRow>` is prepared as the source dataset. The work to generate a queriable `PCollection<BeamSqlRow>` is beyond the scope of Beam SQL DSL. 
+In Step 1, a `PCollection<BeamRecord>` is prepared as the source dataset. The work to generate a queryable `PCollection<BeamRecord>` is beyond the scope of Beam SQL DSL. 
 
-Step 2(Case 1) shows the way to run a query with `BeamSql.simpleQuery()`, be aware that the input `PCollection` is named with a fixed table name __PCOLLECTION__. Step 2(Case 2) is an example to run a query with `BeamSql.query()`. A Table name used in the query is specified when adding `PCollection` to `PCollectionTuple`. As each call of either `BeamSql.query()` or `BeamSql.simpleQuery()` has its own schema repository, developers need to include all `PCollection`s that would be used in yo [...]
+Step 2 (Case 1) shows how to run a query with `BeamSql.simpleQuery()`; be aware that the input `PCollection` is named with the fixed table name __PCOLLECTION__. Step 2 (Case 2) is another example, running a query with `BeamSql.query()`; a table name is specified when adding the `PCollection` to the `PCollectionTuple`. As each call of either `BeamSql.query()` or `BeamSql.simpleQuery()` has its own schema repository, developers need to include all `PCollection`s that will be used in their query.
 
 # <a name="functionality"></a>4. Functionality in Beam SQL
 In line with Beam's unified model for bounded and unbounded data, the SQL DSL provides the same functionality for both bounded and unbounded `PCollection`s. 
 
-Note that, SQL support is not fully completed. Queries that include unsupported features would cause a UnsupportedOperationException.
+Note that SQL support is not yet complete; queries that include unsupported features will cause an `UnsupportedOperationException`.
 
 ## <a name="features"></a>4.1. Supported Features
-The following features are supported in current repository (this chapter will be updated continually).
+The following features are supported in the current repository:
 
 **1. filter clauses;**
 
@@ -94,7 +97,7 @@ The following features are supported in current repository (this chapter will be
 
 **3. aggregations;**
 
-Beam SQL supports aggregation functions COUNT/SUM/MAX/MIN/AVG with group_by in global_window, fixed_window, sliding_window and session_window. A field with type `TIMESTAMP` is required to specify fixed_window/sliding_window/session window, which is used as event timestamp for rows. See below for several examples:
+Beam SQL supports aggregation functions with group_by in global_window, fixed_window, sliding_window and session_window. A field of type `TIMESTAMP` is required to specify a fixed_window/sliding_window/session_window; it is used as the event timestamp for rows. See below for several examples:
 
 ```
 //fixed window, one hour in duration
@@ -126,12 +129,12 @@ The scenarios of join can be categorized into 3 cases:
 2. UnboundedTable JOIN UnboundedTable
 3. BoundedTable JOIN UnboundedTable
 
-For case 1 and case 2, a standard join is utilized as long as the windowFn of the both sides match. For case 3, sideInput is utilized to implement the join. So there are some constraints:
+For case 1 and case 2, a standard join is utilized as long as the windowFn of both sides matches. For case 3, sideInput is utilized to implement the join. So far there are some constraints:
 
 * Only equi-join is supported; CROSS JOIN is not supported;
 * FULL OUTER JOIN is not supported;
 * If it's a LEFT OUTER JOIN, the unbounded table should be on the left side; if it's a RIGHT OUTER JOIN, the unbounded table should be on the right side;
-* The trigger is inherented from upstream, which should be consistent;
+* window/trigger is inherited from upstream, and should be consistent;
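+
+A sketch of a join that satisfies these constraints (the table and column names are made up; `ORDERS` is assumed unbounded and `USERS` bounded, so the unbounded table sits on the left of the LEFT OUTER JOIN):
+
+```
+SELECT o.order_id, u.user_name
+FROM ORDERS o
+LEFT OUTER JOIN USERS u ON o.user_id = u.user_id
+```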
 
 **5. built-in SQL functions**
 
@@ -141,7 +144,7 @@ If the required function is not available, developers can register their own UDF
 
 **create and specify User Defined Function (UDF)**
 
-A UDF can be any Java method that takes zero or more scalar fields, and return one scalar value. Below is an example of UDF and how to use it in DSL:
+A UDF can be 1) any Java method that takes zero or more scalar fields and returns one scalar value, or 2) a `SerializableFunction`. Below is an example of a UDF and how to use it in the DSL:
 
 ```
 /**
@@ -153,44 +156,45 @@ public static class CubicInteger implements BeamSqlUdf{
   }
 }
 
+/**
+ * Another example UDF with {@link SerializableFunction}.
+ */
+public static class CubicIntegerFn implements SerializableFunction<Integer, Integer> {
+  @Override
+  public Integer apply(Integer input) {
+    return input * input * input;
+  }
+}
+
 // register and call in SQL
-String sql = "SELECT f_int, cubic(f_int) as cubicvalue FROM PCOLLECTION WHERE f_int = 2";
+String sql = "SELECT f_int, cubic1(f_int) as cubicvalue1, cubic2(f_int) as cubicvalue2 FROM PCOLLECTION WHERE f_int = 2";
 PCollection<BeamRecord> result =
     input.apply("udfExample",
-        BeamSql.simpleQuery(sql).withUdf("cubic", CubicInteger.class));
+        BeamSql.simpleQuery(sql).withUdf("cubic1", CubicInteger.class)
+            .withUdf("cubic2", new CubicIntegerFn()));
 ```
 
 **create and specify User Defined Aggregate Function (UDAF)**
 
-A UDAF aggregates a set of grouped scalar values, and output a single scalar value. To create a UDAF function, it's required to extend `org.apache.beam.dsls.sql.schema.BeamSqlUdaf<InputT, AccumT, OutputT>`, which defines 4 methods to process an aggregation:
-
-1. init(), to create an initial accumulate value;
-2. add(), to apply a new value to existing accumulate value;
-3. merge(), to merge accumulate values from parallel operators;
-4. result(), to generate the final result from accumulate value;
-
-Here's an example of UDAF:
+Beam SQL can accept a `CombineFn` as a UDAF. Here's an example:
 
 ```
 /**
- * UDAF for test, which returns the sum of square.
+ * UDAF(CombineFn) for test, which returns the sum of squares.
  */
-public static class SquareSum extends BeamSqlUdaf<Integer, Integer, Integer> {
-  public SquareSum() {
-  }
-  
+public static class SquareSum extends CombineFn<Integer, Integer, Integer> {
   @Override
-  public Integer init() {
+  public Integer createAccumulator() {
     return 0;
   }
-  
+
   @Override
-  public Integer add(Integer accumulator, Integer input) {
+  public Integer addInput(Integer accumulator, Integer input) {
     return accumulator + input * input;
   }
-  
+
   @Override
-  public Integer merge(Iterable<Integer> accumulators) {
+  public Integer mergeAccumulators(Iterable<Integer> accumulators) {
     int v = 0;
     Iterator<Integer> ite = accumulators.iterator();
     while (ite.hasNext()) {
@@ -200,20 +204,21 @@ public static class SquareSum extends BeamSqlUdaf<Integer, Integer, Integer> {
   }
 
   @Override
-  public Integer result(Integer accumulator) {
+  public Integer extractOutput(Integer accumulator) {
     return accumulator;
   }
+
 }
 
 //register and call in SQL
 String sql = "SELECT f_int1, squaresum(f_int2) AS `squaresum` FROM PCOLLECTION GROUP BY f_int1";
 PCollection<BeamRecord> result =
     input.apply("udafExample",
-        BeamSql.simpleQuery(sql).withUdaf("squaresum", SquareSum.class));
+        BeamSql.simpleQuery(sql).withUdaf("squaresum", new SquareSum()));
 ```
  
-## <a name="beamsqlrow"></a>4.2. Intro of BeamSqlRow
-`BeamSqlRow`, encoded/decoded by `BeamSqlRowCoder`, represents a single, implicitly structured data item in a Beam SQL compatible `PCollection`. Similar as _row_ in the context of relational database, each `BeamSqlRow` consists of named columns with fixed types(see [4.3. Data Types](#data-type)).
+## <a name="beamsqlrow"></a>4.2. Intro of BeamRecord
+`BeamRecord`, described by `BeamRecordType` (extended as `BeamRecordSqlType` in Beam SQL) and encoded/decoded by `BeamRecordCoder`, represents a single, immutable row in a Beam SQL `PCollection`. Similar to a _row_ in a relational database, each `BeamRecord` consists of named columns with fixed types (see [4.3. Data Types](#data-type)).
 
 A Beam SQL `PCollection` can be created from an external source, from in-memory data, or derived from another SQL query. For `PCollection`s from an external source or in-memory data, it's required to specify the coder explicitly; a `PCollection` derived from a SQL query has the coder set already. Below is one example:
 
@@ -221,19 +226,16 @@ A Beam SQL `PCollection` can be created from an external source, in-memory data
 //define the input row format
 List<String> fieldNames = Arrays.asList("c1", "c2", "c3");
 List<Integer> fieldTypes = Arrays.asList(Types.INTEGER, Types.VARCHAR, Types.DOUBLE);
-BeamSqlRecordType type = BeamSqlRecordType.create(fieldNames, fieldTypes);
-BeamSqlRow row = new BeamSqlRow(type);
-row.addField(0, 1);
-row.addField(1, "row");
-row.addField(2, 1.0);
+BeamRecordSqlType type = BeamRecordSqlType.create(fieldNames, fieldTypes);
+BeamRecord row = new BeamRecord(type, 1, "row", 1.0);
 
 //create a source PCollection with Create.of();
-PCollection<BeamSqlRow> inputTable = PBegin.in(p).apply(Create.of(row)
-    .withCoder(new BeamSqlRowCoder(type)));
+PCollection<BeamRecord> inputTable = PBegin.in(p).apply(Create.of(row)
+    .withCoder(type.getRecordCoder()));
 ```
  
 ## <a name="data-type"></a>4.3. Data Types
-Each type in Beam SQL maps to a Java class to holds the value in `BeamSqlRow`. The following table lists the relation between SQL types and Java classes, which are supported in current repository:
+Each type in Beam SQL maps to a Java class that holds the value in `BeamRecord`. The following table lists the mapping between SQL types and Java classes supported in the current repository:
 
 | SQL Type | Java class |
 | ---- | ---- |

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" <co...@beam.apache.org>.

[beam-site] 05/10: change page dsls/sql.md into a blog post

Posted by me...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit ba662cd2ebfcead48f3628e76fa1a7eb8f101029
Author: mingmxu <mi...@ebay.com>
AuthorDate: Fri Jul 21 22:38:14 2017 -0700

    change page dsls/sql.md into a blog post
---
 src/_data/authors.yml                                      |  4 ++++
 src/_includes/header.html                                  |  4 ----
 .../dsls/sql.md => _posts/2017-07-21-sql-dsl.md}           | 14 +++++++++++---
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/src/_data/authors.yml b/src/_data/authors.yml
index d8cd836..1b76ddb 100644
--- a/src/_data/authors.yml
+++ b/src/_data/authors.yml
@@ -28,6 +28,10 @@ klk:
     name: Kenneth Knowles
     email: klk@apache.org
     twitter: KennKnowles
+mingmxu:
+    name: Mingmin Xu
+    email: mingmxus@gmail.com
+    twitter:
 robertwb:
     name: Robert Bradshaw
     email: robertwb@apache.org
diff --git a/src/_includes/header.html b/src/_includes/header.html
index 79e72b6..3bd4ced 100644
--- a/src/_includes/header.html
+++ b/src/_includes/header.html
@@ -62,10 +62,6 @@
             <li><a href="{{ site.baseurl }}/documentation/runners/flink/">Apache Flink Runner</a></li>
             <li><a href="{{ site.baseurl }}/documentation/runners/spark/">Apache Spark Runner</a></li>
             <li><a href="{{ site.baseurl }}/documentation/runners/dataflow/">Cloud Dataflow Runner</a></li>
-
-            <li role="separator" class="divider"></li>
-            <li class="dropdown-header">DSLs</li>
-            <li><a href="{{ site.baseurl }}/documentation/dsls/sql/">SQL</a></li>
           </ul>
         </li>
         <li class="dropdown">
diff --git a/src/documentation/dsls/sql.md b/src/_posts/2017-07-21-sql-dsl.md
similarity index 97%
rename from src/documentation/dsls/sql.md
rename to src/_posts/2017-07-21-sql-dsl.md
index 76574e5..b21769c 100644
--- a/src/documentation/dsls/sql.md
+++ b/src/_posts/2017-07-21-sql-dsl.md
@@ -1,9 +1,17 @@
 ---
-layout: default
-title: "DSLs: SQL"
-permalink: /documentation/dsls/sql/
+layout: post
+title:  "Use Beam SQL DSL to build a pipeline"
+date:   2017-07-21 00:00:00 -0800
+excerpt_separator: <!--more-->
+categories: blog
+authors:
+  - mingmxu
 ---
 
+BeamSQL provides the capability to execute standard SQL queries using the Beam Java SDK. The DSL interface packages the back-end parsing/validation/assembling features, and delivers an SDK-style API for developers to express processing logic using SQL statements, from a simple TABLE_FILTER to complex queries containing JOIN/GROUP_BY etc.
+
+<!--more-->
+
 * [1. Overview](#overview)
 * [2. The Internals of Beam SQL](#internal-of-sql)
 * [3. Usage of DSL APIs](#usage)

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" <co...@beam.apache.org>.

[beam-site] 07/10: fix broken link

Posted by me...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit e3b4815687522ec54cfcdb906ae9a635fb2d023c
Author: mingmxu <mi...@ebay.com>
AuthorDate: Thu Aug 3 16:10:03 2017 -0700

    fix broken link
---
 src/_posts/2017-07-21-sql-dsl.md | 2 +-
 src/documentation/dsls/sql.md    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/_posts/2017-07-21-sql-dsl.md b/src/_posts/2017-07-21-sql-dsl.md
index f37c24a..4aa97d1 100644
--- a/src/_posts/2017-07-21-sql-dsl.md
+++ b/src/_posts/2017-07-21-sql-dsl.md
@@ -20,7 +20,7 @@ The DSL interface (`BeamSql.query()` and `BeamSql.simpleQuery()`), is the only e
 
 *Note*, the two APIs are equivalent in functionality, `BeamSql.query()` applies on a `PCollectionTuple` with one or many input `PCollection`s, and `BeamSql.simpleQuery()` is a simplified API which applies on single `PCollection`.
 
-[BeamSqlExample](https://github.com/apache/beam/blob/DSL_SQL/dsls/sql/src/main/java/org/apache/beam/dsls/sql/example/BeamSqlExample.java) in code repository shows the usage of both APIs:
+[BeamSqlExample](https://github.com/apache/beam/blob/DSL_SQL/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/example/BeamSqlExample.java) in the code repository shows the usage of both APIs:
 
 ```
 //Step 1. create a source PCollection with Create.of();
diff --git a/src/documentation/dsls/sql.md b/src/documentation/dsls/sql.md
index 76574e5..34f26ae 100644
--- a/src/documentation/dsls/sql.md
+++ b/src/documentation/dsls/sql.md
@@ -58,7 +58,7 @@ The DSL interface (`BeamSql.query()` and `BeamSql.simpleQuery()`), is the only e
 
 *Note*, the two APIs are equivalent in functionality, `BeamSql.query()` applies on a `PCollectionTuple` with one or many input `PCollection`s, and `BeamSql.simpleQuery()` is a simplified API which applies on single `PCollection`.
 
-[BeamSqlExample](https://github.com/apache/beam/blob/DSL_SQL/dsls/sql/src/main/java/org/apache/beam/dsls/sql/example/BeamSqlExample.java) in code repository shows the usage of both APIs:
+[BeamSqlExample](https://github.com/apache/beam/blob/DSL_SQL/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/example/BeamSqlExample.java) in the code repository shows the usage of both APIs:
 
 ```
 //Step 1. create a source PCollection with Create.of();

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" <co...@beam.apache.org>.