Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/24 15:22:07 UTC

[GitHub] [beam] robertwb commented on a diff in pull request #22020: Removes examples of unscalable sinks from documentation.

robertwb commented on code in PR #22020:
URL: https://github.com/apache/beam/pull/22020#discussion_r906168373


##########
website/www/site/content/en/documentation/io/developing-io-overview.md:
##########
@@ -180,9 +180,17 @@ To create a Beam sink, we recommend that you use a `ParDo` that writes the
 received records to the data store. To develop more complex sinks (for example,
 to support data de-duplication when failures are retried by a runner), use
 `ParDo`, `GroupByKey`, and other available Beam transforms.
+Many data services are optimized to write batches of elements at a time,
+so it may make sense to group the elements into batches before writing.
+Persistent connections can be initialized in a DoFn's `setUp` or `startBundle`
+method rather than upon the receipt of every element as well.

Review Comment:
   I don't think any sinks to date have found a use/need for shared, but I suppose it's possible. I don't think it's common/specific enough to mention here though. 
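
For reference, the batching and connection-reuse pattern described in the added lines could look roughly like the sketch below. It is not taken from the PR. It is written as a plain Python class whose method names mirror the Beam Python `DoFn` lifecycle (`setup`, `start_bundle`, `process`, `finish_bundle`, `teardown`) so that it runs without Beam installed; in a real pipeline the class would subclass `apache_beam.DoFn`. `FakeClient`, `BatchingWriteFn`, `run_bundle`, and `max_batch_size` are hypothetical names chosen for illustration.

```python
class FakeClient:
    """Hypothetical stand-in for a data-store client; records each batch write."""

    def __init__(self):
        self.batches = []

    def write_batch(self, rows):
        # Copy the rows so later buffer changes cannot alter the record.
        self.batches.append(list(rows))

    def close(self):
        pass


class BatchingWriteFn:
    """Sketch of a batching sink that mirrors the Beam Python DoFn lifecycle.

    Assumption: in a real pipeline this would subclass apache_beam.DoFn.
    """

    def __init__(self, client_factory, max_batch_size=3):
        self._client_factory = client_factory
        self._max_batch_size = max_batch_size
        self._client = None
        self._buffer = []

    def setup(self):
        # Open the connection once per DoFn instance, not once per element.
        self._client = self._client_factory()

    def start_bundle(self):
        # Start every bundle with an empty buffer.
        self._buffer = []

    def process(self, element):
        self._buffer.append(element)
        if len(self._buffer) >= self._max_batch_size:
            self._flush()

    def finish_bundle(self):
        # Write whatever is left so no element is dropped at bundle end.
        self._flush()

    def teardown(self):
        if self._client is not None:
            self._client.close()
            self._client = None

    def _flush(self):
        if self._buffer:
            self._client.write_batch(self._buffer)
            self._buffer = []


def run_bundle(fn, elements):
    """Hypothetical driver that imitates how a runner processes one bundle."""
    fn.start_bundle()
    for element in elements:
        fn.process(element)
    fn.finish_bundle()
```

Flushing in `finish_bundle` matters here: a buffer that is only written when it fills up would silently lose the final partial batch of each bundle.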


