Posted to user@beam.apache.org by Marco Mistroni <mm...@gmail.com> on 2019/12/17 22:01:04 UTC

Please assist; how do i use a Sample transform ?

Hi all,
Beam noob here.

I have written a Beam app that processes the content of a file.
For debugging purposes, I wanted to get a sample of the lines in the
file using the Sample combiner, but I cannot find any examples in Python.
Here's my rough code:

...
| 'Filter only row longer than 100 chars' >> beam.Filter(lambda row: len(row) > 100)
| 'sampling lines' >> beam.transforms.combiners.Sample()

but the code above gives me

TypeError: unsupported operand type(s) for >>: 'str' and 'Sample'
Could anyone help?
kind regards
Marco
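
A note on where this TypeError comes from: in Python, 'label' >> obj only works if obj defines __rrshift__, because str has no >> operator of its own. Beam's PTransform base class provides that hook, while an instance of a plain container class does not. The sketch below reproduces the mechanism in pure Python; Labelable and Container are hypothetical stand-ins, not Beam classes.

```python
class Labelable:
    """Hypothetical stand-in for a transform that accepts 'label' >> obj."""

    def __rrshift__(self, label):
        # Called for `label >> self` when the left operand (a str) cannot handle >>.
        return (label, self)


class Container:
    """Hypothetical stand-in for a plain class with no operator support."""


# Works: Labelable defines __rrshift__.
labeled = 'sampling lines' >> Labelable()

# Fails: neither str nor Container knows what >> means here.
try:
    'sampling lines' >> Container()
except TypeError as err:
    message = str(err)
    print(message)
```

The printed message has the same shape as the one in the question, with 'Container' in place of 'Sample'.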

Re: Please assist; how do i use a Sample transform ?

Posted by Marco Mistroni <mm...@gmail.com>.
Many thanks for your help..

On Wed, Dec 18, 2019, 12:45 AM Kyle Weaver <kc...@google.com> wrote:

> We could make the Sample class uninstantiable to give a slightly more
> specific error here. Not sure how much that would help though.

Re: Please assist; how do i use a Sample transform ?

Posted by Kyle Weaver <kc...@google.com>.
We could make the Sample class uninstantiable to give a slightly more
specific error here. Not sure how much that would help though.
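
For illustration only, one way a namespace-style class could refuse instantiation with a clearer message is to raise from __init__. This is a sketch of the idea, not Beam's actual code; SampleLike and its nested class are hypothetical names.

```python
class SampleLike:
    """Hypothetical namespace-only class; it is not meant to be instantiated."""

    def __init__(self):
        # Point the caller at the nested classes instead of failing later on >>.
        raise TypeError(
            'SampleLike is only a namespace; use a nested class such as '
            'SampleLike.FixedSizeGlobally(n) instead.')

    class FixedSizeGlobally:
        """Hypothetical nested class; instantiating it is unaffected."""

        def __init__(self, n):
            self.n = n


# The nested class can still be created as usual.
ok = SampleLike.FixedSizeGlobally(5)

# Instantiating the container itself now fails with a targeted message.
try:
    SampleLike()
except TypeError as err:
    hint = str(err)
    print(hint)
```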

On Tue, Dec 17, 2019 at 4:40 PM Robert Bradshaw <ro...@google.com> wrote:

> beam.transforms.combiners.Sample is a container class that dates back
> to the days when folks more familiar with Java were just copying
> things over; it is just an empty class containing the actual transforms
> (as Kyle indicates). These are shorthand for
> beam.CombineGlobally(beam.transforms.combiners.SampleCombineFn(...)),
> beam.CombinePerKey(beam.transforms.combiners.SampleCombineFn(...)),
> etc.

Re: Please assist; how do i use a Sample transform ?

Posted by Robert Bradshaw <ro...@google.com>.
beam.transforms.combiners.Sample is a container class that dates back
to the days when folks more familiar with Java were just copying
things over; it is just an empty class containing the actual transforms
(as Kyle indicates). These are shorthand for
beam.CombineGlobally(beam.transforms.combiners.SampleCombineFn(...)),
beam.CombinePerKey(beam.transforms.combiners.SampleCombineFn(...)),
etc.

On Tue, Dec 17, 2019 at 2:10 PM Kyle Weaver <kc...@google.com> wrote:
>
> Looks like you need to choose one of the classes nested inside Sample, probably FixedSizeGlobally in your case. For example,
>
> beam.transforms.combiners.Sample.FixedSizeGlobally(5)
>
> Source: https://github.com/apache/beam/blob/df376164fee1a8f54f3ad00c45190b813ffbdd34/sdks/python/apache_beam/transforms/combiners.py#L619

Re: Please assist; how do i use a Sample transform ?

Posted by Kyle Weaver <kc...@google.com>.
Looks like you need to choose one of the classes nested inside Sample, probably
FixedSizeGlobally in your case. For example,

beam.transforms.combiners.Sample.FixedSizeGlobally(5)

Source:
https://github.com/apache/beam/blob/df376164fee1a8f54f3ad00c45190b813ffbdd34/sdks/python/apache_beam/transforms/combiners.py#L619
