Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2021/02/19 17:17:06 UTC

[jira] [Commented] (BEAM-11633) Steer people towards ParDo, SDF, instead of the original Source framework

    [ https://issues.apache.org/jira/browse/BEAM-11633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287232#comment-17287232 ] 

Beam JIRA Bot commented on BEAM-11633:
--------------------------------------

This issue is assigned but has not received an update in 30 days so it has been labeled "stale-assigned". If you are still working on the issue, please give an update and remove the label. If you are no longer working on the issue, please unassign so someone else may work on it. In 7 days the issue will be automatically unassigned.

> Steer people towards ParDo, SDF, instead of the original Source framework
> -------------------------------------------------------------------------
>
>                 Key: BEAM-11633
>                 URL: https://issues.apache.org/jira/browse/BEAM-11633
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Ahmet Altay
>            Assignee: Boyuan Zhang
>            Priority: P2
>              Labels: stale-assigned
>
> People still write sources when, 90% of the time, they shouldn't. We tell them [not to|https://beam.apache.org/documentation/io/developing-io-overview/], but we should do so more effectively. In particular, the instructions for the ParDo alternative do not name Reshuffle explicitly, even though it is exactly what should be used here. They should also mention that the ParDo needs to be seeded by a Create step or similar.
> A big issue here is that Sources are called "Sources". When a new developer is looking to author a pipeline, this is the first place they will look, especially if they're just scanning or searching through documentation. We need to aggressively counteract the gravity of the current naming scheme.
> Suggestion: Improve the documentation mentioned above, and update the Javadoc for BoundedSource, etc., to steer people away from it. If they are part of the small collection of power users who need a source, they'll be okay.
> Suggestions for future work:
>  - Consider deprecating the Source framework in favor of SDF.
>  - Point to the SDF docs (and simplify them).
>  - Many users can simply use FileIO.matchAll followed by a ParDo. Recommend alternatives of that kind.
> Assigning this to [~boyuanz], but anyone could help here.
>  /cc [~kenn] [~chamikara] [~reuvenlax] [~robertwb] [~rtnguyen] [~dcavazos]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)