Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/05/27 03:29:11 UTC

[GitHub] [beam] epicfaace commented on a change in pull request #11802: [BEAM-9916] Update I/O documentation links and create more complete I/O matrix

epicfaace commented on a change in pull request #11802:
URL: https://github.com/apache/beam/pull/11802#discussion_r430709836



##########
File path: website/www/site/data/io_matrix.yaml
##########
@@ -0,0 +1,377 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+categories:
+  - name: File-based
+    description: These I/O connectors involve working with files.
+    rows:
+      - transform: FileIO
+        description: "General-purpose transforms for working with files: listing files (matching), reading and writing."
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.FileIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/FileIO.html
+          - language: py
+            name: apache_beam.io.fileio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.fileio.html
+      - transform: AvroIO
+        description: PTransforms for reading from and writing to [Avro](https://avro.apache.org/) files.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.AvroIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/AvroIO.html
+          - language: py
+            name: apache_beam.io.avroio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.avroio.html
+          - language: go
+            name: github.com/apache/beam/sdks/go/pkg/beam/io/avroio
+            url: https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam/io/avroio
+      - transform: TextIO
+        description: PTransforms for reading and writing text files.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.TextIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/TextIO.html
+          - language: py
+            name: apache_beam.io.textio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.textio.html
+          - language: go
+            name: github.com/apache/beam/sdks/go/pkg/beam/io/textio
+            url: https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam/io/textio
+      - transform: TFRecordIO
+        description: PTransforms for reading and writing [TensorFlow TFRecord](https://www.tensorflow.org/tutorials/load_data/tfrecord) files.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.TFRecordIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/TFRecordIO.html
+          - language: py
+            name: apache_beam.io.tfrecordio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.tfrecordio.html
+      - transform: XmlIO
+        description: Transforms for reading and writing XML files using [JAXB](https://www.oracle.com/technical-resources/articles/javase/jaxb.html) mappers.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.xml.XmlIO
+            url: https://beam.apache.org/releases/javadoc/2.3.0/org/apache/beam/sdk/io/xml/XmlIO.html
+      - transform: TikaIO
+        description: Transforms for parsing arbitrary files using [Apache Tika](https://tika.apache.org/).
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.tika.TikaIO
+            url: https://beam.apache.org/releases/javadoc/2.3.0/org/apache/beam/sdk/io/tika/TikaIO.html
+      - transform: ParquetIO
+        description: IO for reading from and writing to [Parquet](https://parquet.apache.org/) files.
+        docs: /documentation/io/built-in/parquet/
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.parquet.ParquetIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/parquet/ParquetIO.html
+          - language: py
+            name: apache_beam.io.parquetio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.parquetio.html
+      - transform: ThriftIO
+        description: PTransforms for reading and writing files containing [Thrift](https://thrift.apache.org/)-encoded data.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.thrift.ThriftIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/thrift/ThriftIO.html
+      - transform: VcfIO
+        description: A source for reading from [VCF files](https://samtools.github.io/hts-specs/VCFv4.2.pdf) (version 4.x).
+        implementations:
+          - language: py
+            name: apache_beam.io.vcfio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.vcfio.html
+      - transform: S3IO
+        description: A source for reading from and writing to [Amazon S3](https://aws.amazon.com/s3/).
+        implementations:
+          - language: py
+            name: apache_beam.io.aws.s3io
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.aws.s3io.html
+      - transform: GcsIO
+        description: A source for reading from and writing to [Google Cloud Storage](https://cloud.google.com/storage).
+        implementations:
+          - language: py
+            name: apache_beam.io.gcp.gcsio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.gcp.gcsio.html
+  - name: FileSystem
+    description: Beam provides a FileSystem interface that defines APIs for writing file-system-agnostic code. Several I/O connectors are implemented as FileSystem implementations.
+    rows:
+      - transform: HadoopFileSystem
+        description: "`FileSystem` implementation for accessing [Hadoop](https://hadoop.apache.org/) Distributed File System files."
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.hdfs.HadoopFileSystemRegistrar
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/hdfs/HadoopFileSystemRegistrar.html
+          - language: py
+            name: apache_beam.io.hadoopfilesystem
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.hadoopfilesystem.html
+      - transform: GcsFileSystem
+        description: "`FileSystem` implementation for [Google Cloud Storage](https://cloud.google.com/storage)."
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystemRegistrar
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/extensions/gcp/storage/GcsFileSystemRegistrar.html
+          - language: py
+            name: apache_beam.io.gcp.gcsfilesystem
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.gcp.gcsfilesystem.html
+          - language: go
+            name: github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/gcs
+            url: https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/gcs
+      - transform: LocalFileSystem
+        description: "`FileSystem` implementation for accessing files on disk."
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.LocalFileSystemRegistrar
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/LocalFileSystemRegistrar.html
+          - language: py
+            name: apache_beam.io.localfilesystem
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.localfilesystem.html
+          - language: go
+            name: github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/local
+            url: https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/local
+      - transform: S3FileSystem
+        description: "`FileSystem` implementation for [Amazon S3](https://aws.amazon.com/s3/)."
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.aws.s3.S3FileSystemRegistrar
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/hdfs/package-summary.html
+      - transform: In-memory
+        description: "`FileSystem` implementation in memory; useful for testing."
+        implementations:
+          - language: go
+            name: github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/memfs
+            url: https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/memfs
+  - name: Messaging
+    description: These I/O connectors typically involve working with unbounded sources that come from messaging sources.
+    rows:
+      - transform: KinesisIO
+        description: PTransforms for reading from and writing to [Kinesis](https://aws.amazon.com/kinesis/) streams.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.kinesis.KinesisIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/kinesis/KinesisIO.html
+      - transform: AmqpIO
+        description: Support for the AMQP 1.0 protocol using the Apache Qpid Proton-J library.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.amqp.AmqpIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/amqp/AmqpIO.html
+      - transform: KafkaIO
+        description: Read and Write PTransforms for [Apache Kafka](https://kafka.apache.org/).
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.kafka.KafkaIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/kafka/KafkaIO.html
+          - language: py
+            name: apache_beam.io.external.kafka
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.external.kafka.html
+      - transform: PubSubIO
+        description: Read and Write PTransforms for [Google Cloud Pub/Sub](https://cloud.google.com/pubsub) streams.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.gcp.pubsub.PubsubIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.html
+          - language: py
+            name: apache_beam.io.gcp.pubsub
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.gcp.pubsub.html
+          - language: py
+            name: apache_beam.io.external.gcp.pubsub
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.external.gcp.pubsub.html
+          - language: go
+            name: github.com/apache/beam/sdks/go/pkg/beam/io/pubsubio
+            url: https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam/io/pubsubio
+      - transform: JmsIO
+        description: An unbounded source for [JMS](https://www.oracle.com/java/technologies/java-message-service.html) destinations (queues or topics).
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.jms.JmsIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/jms/JmsIO.html
+      - transform: MqttIO
+        description: An unbounded source for [MQTT](https://mqtt.org/) brokers.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.mqtt.MqttIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/mqtt/MqttIO.html
+      - transform: RabbitMqIO
+        description: An IO to publish or consume messages with a RabbitMQ broker.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.rabbitmq.RabbitMqIO
+            url: https://github.com/apache/beam/blob/master/sdks/java/io/rabbitmq/src/main/java/org/apache/beam/sdk/io/rabbitmq/RabbitMqIO.java

Review comment:
       I wasn't sure why the javadoc for RabbitMqIO and KuduIO wasn't available, so I linked to the GitHub source files instead. Any idea, @pabloem?
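
       To find every entry that falls back to a source link like this one, a small script could scan the matrix. The following is only a sketch under stated assumptions: `MATRIX` is a hand-copied two-row excerpt shaped like io_matrix.yaml rather than the parsed file, and `DOC_HOSTS` and `source_link_fallbacks` are names made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical excerpt mirroring the shape of io_matrix.yaml
# (categories -> rows -> implementations). Not the full file.
MATRIX = {
    "categories": [
        {
            "name": "Messaging",
            "rows": [
                {
                    "transform": "MqttIO",
                    "implementations": [
                        {
                            "language": "java",
                            "name": "org.apache.beam.sdk.io.mqtt.MqttIO",
                            "url": "https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/mqtt/MqttIO.html",
                        },
                    ],
                },
                {
                    "transform": "RabbitMqIO",
                    "implementations": [
                        {
                            "language": "java",
                            "name": "org.apache.beam.sdk.io.rabbitmq.RabbitMqIO",
                            "url": "https://github.com/apache/beam/blob/master/sdks/java/io/rabbitmq/src/main/java/org/apache/beam/sdk/io/rabbitmq/RabbitMqIO.java",
                        },
                    ],
                },
            ],
        }
    ]
}

# Hosts treated as generated API docs in this sketch (an assumption,
# based on the javadoc, pydoc and godoc links used in the matrix).
DOC_HOSTS = {"beam.apache.org", "godoc.org"}


def source_link_fallbacks(matrix):
    """Return (transform, language, url) for entries whose URL is not on a docs host."""
    found = []
    for category in matrix["categories"]:
        for row in category["rows"]:
            for impl in row["implementations"]:
                if urlparse(impl["url"]).netloc not in DOC_HOSTS:
                    found.append((row["transform"], impl["language"], impl["url"]))
    return found


if __name__ == "__main__":
    for transform, language, url in source_link_fallbacks(MATRIX):
        print(f"{transform} ({language}): {url}")
```

       Run against the real file (loaded with a YAML parser of your choice), a check like this would list the rows to revisit once the missing javadoc is published.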

Review comment:
       https://issues.apache.org/jira/browse/BEAM-10098

##########
File path: website/www/site/data/io_matrix.yaml
##########
@@ -0,0 +1,377 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+categories:
+  - name: File-based
+    description: These I/O connectors involve working with files.
+    rows:
+      - transform: FileIO
+        description: "General-purpose transforms for working with files: listing files (matching), reading and writing."
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.FileIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/FileIO.html
+          - language: py
+            name: apache_beam.io.fileio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.fileio.html
+      - transform: AvroIO
+        description: PTransforms for reading from and writing to [Avro](https://avro.apache.org/) files.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.AvroIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/AvroIO.html
+          - language: py
+            name: apache_beam.io.avroio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.avroio.html
+          - language: go
+            name: github.com/apache/beam/sdks/go/pkg/beam/io/avroio
+            url: https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam/io/avroio
+      - transform: TextIO
+        description: PTransforms for reading and writing text files.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.TextIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/TextIO.html
+          - language: py
+            name: apache_beam.io.textio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.textio.html
+          - language: go
+            name: github.com/apache/beam/sdks/go/pkg/beam/io/textio
+            url: https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam/io/textio
+      - transform: TFRecordIO
+        description: PTransforms for reading and writing [TensorFlow TFRecord](https://www.tensorflow.org/tutorials/load_data/tfrecord) files.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.TFRecordIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/TFRecordIO.html
+          - language: py
+            name: apache_beam.io.tfrecordio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.tfrecordio.html
+      - transform: XmlIO
+        description: Transforms for reading and writing XML files using [JAXB](https://www.oracle.com/technical-resources/articles/javase/jaxb.html) mappers.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.xml.XmlIO
+            url: https://beam.apache.org/releases/javadoc/2.3.0/org/apache/beam/sdk/io/xml/XmlIO.html
+      - transform: TikaIO
+        description: Transforms for parsing arbitrary files using [Apache Tika](https://tika.apache.org/).
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.tika.TikaIO
+            url: https://beam.apache.org/releases/javadoc/2.3.0/org/apache/beam/sdk/io/tika/TikaIO.html
+      - transform: ParquetIO
+        description: IO for reading from and writing to [Parquet](https://parquet.apache.org/) files.
+        docs: /documentation/io/built-in/parquet/
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.parquet.ParquetIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/parquet/ParquetIO.html
+          - language: py
+            name: apache_beam.io.parquetio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.parquetio.html
+      - transform: ThriftIO
+        description: PTransforms for reading and writing files containing [Thrift](https://thrift.apache.org/)-encoded data.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.thrift.ThriftIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/thrift/ThriftIO.html
+      - transform: VcfIO
+        description: A source for reading from [VCF files](https://samtools.github.io/hts-specs/VCFv4.2.pdf) (version 4.x).
+        implementations:
+          - language: py
+            name: apache_beam.io.vcfio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.vcfio.html
+      - transform: S3IO
+        description: A source for reading from and writing to [Amazon S3](https://aws.amazon.com/s3/).
+        implementations:
+          - language: py
+            name: apache_beam.io.aws.s3io
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.aws.s3io.html
+      - transform: GcsIO
+        description: A source for reading from and writing to [Google Cloud Storage](https://cloud.google.com/storage).
+        implementations:
+          - language: py
+            name: apache_beam.io.gcp.gcsio
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.gcp.gcsio.html
+  - name: FileSystem
+    description: Beam provides a FileSystem interface that defines APIs for writing file-system-agnostic code. Several I/O connectors are implemented as FileSystem implementations.
+    rows:
+      - transform: HadoopFileSystem
+        description: "`FileSystem` implementation for accessing [Hadoop](https://hadoop.apache.org/) Distributed File System files."
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.hdfs.HadoopFileSystemRegistrar
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/hdfs/HadoopFileSystemRegistrar.html
+          - language: py
+            name: apache_beam.io.hadoopfilesystem
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.hadoopfilesystem.html
+      - transform: GcsFileSystem
+        description: "`FileSystem` implementation for [Google Cloud Storage](https://cloud.google.com/storage)."
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystemRegistrar
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/extensions/gcp/storage/GcsFileSystemRegistrar.html
+          - language: py
+            name: apache_beam.io.gcp.gcsfilesystem
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.gcp.gcsfilesystem.html
+          - language: go
+            name: github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/gcs
+            url: https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/gcs
+      - transform: LocalFileSystem
+        description: "`FileSystem` implementation for accessing files on disk."
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.LocalFileSystemRegistrar
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/LocalFileSystemRegistrar.html
+          - language: py
+            name: apache_beam.io.localfilesystem
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.localfilesystem.html
+          - language: go
+            name: github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/local
+            url: https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/local
+      - transform: S3FileSystem
+        description: "`FileSystem` implementation for [Amazon S3](https://aws.amazon.com/s3/)."
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.aws.s3.S3FileSystemRegistrar
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/aws/s3/package-summary.html
+      - transform: In-memory
+        description: "`FileSystem` implementation in memory; useful for testing."
+        implementations:
+          - language: go
+            name: github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/memfs
+            url: https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam/io/filesystem/memfs
+  - name: Messaging
+    description: These I/O connectors typically involve working with unbounded data that comes from messaging systems.
+    rows:
+      - transform: KinesisIO
+        description: PTransforms for reading from and writing to [Kinesis](https://aws.amazon.com/kinesis/) streams.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.kinesis.KinesisIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/kinesis/KinesisIO.html
+      - transform: AmqpIO
+        description: AMQP 1.0 protocol support using the Apache Qpid Proton-J library.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.amqp.AmqpIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/amqp/AmqpIO.html
+      - transform: KafkaIO
+        description: Read and Write PTransforms for [Apache Kafka](https://kafka.apache.org/).
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.kafka.KafkaIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/kafka/KafkaIO.html
+          - language: py
+            name: apache_beam.io.external.kafka
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.external.kafka.html
+      - transform: PubSubIO
+        description: Read and Write PTransforms for [Google Cloud Pub/Sub](https://cloud.google.com/pubsub) streams.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.gcp.pubsub.PubsubIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.html
+          - language: py
+            name: apache_beam.io.gcp.pubsub
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.gcp.pubsub.html
+          - language: py
+            name: apache_beam.io.external.gcp.pubsub
+            url: https://beam.apache.org/releases/pydoc/current/apache_beam.io.external.gcp.pubsub.html
+          - language: go
+            name: github.com/apache/beam/sdks/go/pkg/beam/io/pubsubio
+            url: https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam/io/pubsubio
+      - transform: JmsIO
+        description: An unbounded source for [JMS](https://www.oracle.com/java/technologies/java-message-service.html) destinations (queues or topics).
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.jms.JmsIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/jms/JmsIO.html
+      - transform: MqttIO
+        description: An unbounded source for [MQTT](https://mqtt.org/) brokers.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.mqtt.MqttIO
+            url: https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/mqtt/MqttIO.html
+      - transform: RabbitMqIO
+        description: An IO to publish or consume messages with a RabbitMQ broker.
+        implementations:
+          - language: java
+            name: org.apache.beam.sdk.io.rabbitmq.RabbitMqIO
+            url: https://github.com/apache/beam/blob/master/sdks/java/io/rabbitmq/src/main/java/org/apache/beam/sdk/io/rabbitmq/RabbitMqIO.java

Review comment:
       I've also made https://issues.apache.org/jira/browse/BEAM-10099
