Posted to dev@beam.apache.org by Rafael Fernandez <rf...@google.com> on 2018/02/28 18:23:22 UTC

Proposed improvements to our documentation

Hi folks,

I think we've all seen a few areas for improvement here and there in our
docs. For example, one can find a Javadoc entry with outdated content [1],
or "sample" code snippets that have problems, such as not compiling [2].

I think a good thing to do is to invest in extending our documentation to
include a robust per-transform reference, with samples and a good
description of what each transform does, and to keep Javadoc as a solid
source of API documentation. I believe similar approaches can benefit
Python and other languages.

What do you think? I'm happy to spend some time now and then and
incrementally move in this direction. I would like some help from the
community with reviews and suggestions (and perhaps with picking up
associated JIRAs as I file them). Good idea? Bad? Try? +1?

Thanks,
r

[1] See
https://github.com/apache/beam/blob/a629f73ee4e64c470e0c78cc6f51b8625d781b41/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineWithContext.java
which contains a stale reference to KeyedCombineFn.

[2]
https://github.com/apache/beam/blob/5fb30ec8265c841cd8c4e6ae16b43be1f171eabb/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/FlatMapElements.java#L65

Re: Proposed improvements to our documentation

Posted by Ahmet Altay <al...@google.com>.
+1

I think this is a great idea. It can also serve as an inventory of where a
language might be lacking in transforms, and provide a good starting point
for new contributors to fill in those gaps by looking at the existing Java
implementations.

On Wed, Feb 28, 2018 at 10:53 AM, Lukasz Cwik <lc...@google.com> wrote:

> +1
>
> On Wed, Feb 28, 2018 at 10:46 AM, Kenneth Knowles <kl...@google.com> wrote:
>
>> Yes! I love the idea of having a good cross-language transform reference
>> on the web site. Very good idea to get started now and provide the
>> skeleton, then fill out additional transforms and additional languages
>> incrementally.
>>
>> Kenn

Re: Proposed improvements to our documentation

Posted by Rafael Fernandez <rf...@google.com>.
Thanks for all the feedback. I have filed a JIRA [1] to get started.

[1] https://issues.apache.org/jira/browse/BEAM-3763


On Wed, Feb 28, 2018 at 12:11 PM Reuven Lax <re...@google.com> wrote:

> +1 - too many things are documented only in Javadoc. While there are some
> users who are more likely to read Javadoc (e.g. via an IDE), we should try
> to make this part of our public documentation. This will help us document
> the other languages as well. I've noticed that some basic things (e.g. how
> do I access the current window inside a ParDo) are not easy to discover in
> our documentation.
>
> Also strong +1 to Eugene's proposal. Much of our documentation is
> base-level documentation, i.e. we document the low-level concepts such as
> PCollection, etc. However, there's a strong need for use-case-based
> documentation.
>
> Reuven
>
>
> On Wed, Feb 28, 2018 at 11:58 AM Eugene Kirpichov <ki...@google.com>
> wrote:
>
>> +1 sounds reasonable.
>>
>> A couple more areas where our documentation could use some work:
>>
>> - I feel very strongly that the documentation of windowing/triggers
>> is due for a complete rewrite. It was written when Beam was first being
>> revealed to the world, and now we have extensive experience with it
>> ourselves, as well as extensive experience explaining it to users and
>> seeing what users get wrong in practice.
>>
>> - It'd be good if we had in-depth articles in the documentation on common
>> but broad topics, such as "How do I enrich a stream", "How do I join two
>> streams", "How do I efficiently call an external REST service", "How do I
>> express sequencing, do X then Y", "How do I maintain a running
>> sliding-window aggregation", etc.
>>
>> On Wed, Feb 28, 2018 at 11:00 AM Chamikara Jayalath <ch...@google.com>
>> wrote:
>>
>>> +1
>>>
>>> A per-transform reference will definitely help Python (and Go?) since
>>> some transforms lack detailed documentation compared to Java. Additionally,
>>> it might be a good idea to compare Java/Py/Go docs in general to make sure
>>> there are no inconsistencies.
>>>
>>> - Cham

Re: Proposed improvements to our documentation

Posted by Reuven Lax <re...@google.com>.
+1 - too many things are documented only in Javadoc. While there are some
users who are more likely to read Javadoc (e.g. via an IDE), we should try
to make this part of our public documentation. This will help us document
the other languages as well. I've noticed that some basic things (e.g. how
do I access the current window inside a ParDo) are not easy to discover in
our documentation.

Also strong +1 to Eugene's proposal. Much of our documentation is
base-level documentation, i.e. we document the low-level concepts such as
PCollection, etc. However, there's a strong need for use-case-based
documentation.

Reuven



Re: Proposed improvements to our documentation

Posted by Eugene Kirpichov <ki...@google.com>.
+1 sounds reasonable.

A couple more areas where our documentation could use some work:

- I feel very strongly that the documentation of windowing/triggers is
due for a complete rewrite. It was written when Beam was first being
revealed to the world, and now we have extensive experience with it
ourselves, as well as extensive experience explaining it to users and
seeing what users get wrong in practice.

- It'd be good if we had in-depth articles in the documentation on common
but broad topics, such as "How do I enrich a stream", "How do I join two
streams", "How do I efficiently call an external REST service", "How do I
express sequencing, do X then Y", "How do I maintain a running
sliding-window aggregation", etc.


Re: Proposed improvements to our documentation

Posted by Chamikara Jayalath <ch...@google.com>.
+1

A per-transform reference will definitely help Python (and Go?) since some
transforms lack detailed documentation compared to Java. Additionally, it
might be a good idea to compare Java/Py/Go docs in general to make sure
there are no inconsistencies.

- Cham


Re: Proposed improvements to our documentation

Posted by Lukasz Cwik <lc...@google.com>.
+1


Re: Proposed improvements to our documentation

Posted by Kenneth Knowles <kl...@google.com>.
Yes! I love the idea of having a good cross-language transform reference on
the web site. Very good idea to get started now and provide the skeleton,
then fill out additional transforms and additional languages incrementally.

Kenn
