Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/04/06 18:09:27 UTC

[GitHub] [beam] apilloud commented on a diff in pull request #17298: Minor: Prefer registered schema in SQL docs

apilloud commented on code in PR #17298:
URL: https://github.com/apache/beam/pull/17298#discussion_r844238373


##########
website/www/site/content/en/documentation/dsls/sql/walkthrough.md:
##########
@@ -20,18 +20,15 @@ limitations under the License.
 
 This page illustrates the usage of Beam SQL with example code.
 
-## Row
+## Beam Schemas and Rows
 
-Before applying a SQL query to a `PCollection`, the data in the collection must
-be in `Row` format. A `Row` represents a single, immutable record in a Beam SQL
-`PCollection`. The names and types of the fields/columns in the row are defined
-by its associated [Schema](https://beam.apache.org/releases/javadoc/{{< param release_latest >}}/index.html?org/apache/beam/sdk/schemas/Schema.html).
-You can use the [Schema.builder()](https://beam.apache.org/releases/javadoc/{{< param release_latest >}}/index.html?org/apache/beam/sdk/schemas/Schema.html) to create
-`Schemas`. See [Data
-Types](/documentation/dsls/sql/data-types) for more details on supported primitive data types.
+A SQL query can only be applied to a a `PCollection<T>`
+where `T` has a schema registered (preferred), or a `PCollection<Row>`. See the

Review Comment:
   I prefer `PCollection<Row>`, but I think each has use cases where it is the better choice. Can you drop the "(preferred)"?



##########
website/www/site/content/en/documentation/dsls/sql/walkthrough.md:
##########
@@ -20,18 +20,15 @@ limitations under the License.
 
 This page illustrates the usage of Beam SQL with example code.
 
-## Row
+## Beam Schemas and Rows
 
-Before applying a SQL query to a `PCollection`, the data in the collection must
-be in `Row` format. A `Row` represents a single, immutable record in a Beam SQL
-`PCollection`. The names and types of the fields/columns in the row are defined
-by its associated [Schema](https://beam.apache.org/releases/javadoc/{{< param release_latest >}}/index.html?org/apache/beam/sdk/schemas/Schema.html).
-You can use the [Schema.builder()](https://beam.apache.org/releases/javadoc/{{< param release_latest >}}/index.html?org/apache/beam/sdk/schemas/Schema.html) to create
-`Schemas`. See [Data
-Types](/documentation/dsls/sql/data-types) for more details on supported primitive data types.
+A SQL query can only be applied to a a `PCollection<T>`
+where `T` has a schema registered (preferred), or a `PCollection<Row>`. See the
+[schema documentation](/documentation/programming-guide/#what-is-a-schema) in
+the Beam Programming Guide for details on registering a schema for a type `T`.
 
-
-A `PCollection<Row>` can be obtained multiple ways, for example:
+If you'd prefer to work with `Row` directly, a `PCollection<Row>` can be

Review Comment:
   How about this instead: "If you don't have an existing type, a `PCollection<Row>` can be"... (or, to be less opinionated, "fixed type" instead of "existing type").


