Posted to dev@beam.apache.org by "李劲松(之信)" <zh...@alibaba-inc.com> on 2016/11/08 03:44:58 UTC

Verify a new Runner

Hi there,

I'm working on the Beam integration (= a new runner, mainly for streaming; almost done) for an internal system at Alibaba, targeted for production use. I'm wondering if you could give me some advice on how to test/verify such an implementation. Thank you!

Best,
Zhixin

Re: Verify a new Runner

Posted by "李劲松(之信)" <zh...@alibaba-inc.com>.
Thanks a lot, Kenn, for the information! It is very helpful.

We have a plan now and are working on the tests; we will let you know if we encounter any problems. Thank you!

Best,
Zhixin
------------------------------------------------------------------
From: Kenneth Knowles <kl...@google.com>
Sent: Tuesday, November 8, 2016 12:11
To: 李劲松(之信) <zh...@alibaba-inc.com>; dev <de...@beam.incubator.apache.org>
Cc: 钱正平(布民) <zh...@alibaba-inc.com>; 金晓军(仙隐) <xi...@alibaba-inc.com>; 廖新涛(蚩尤) <ch...@taobao.com>
Subject: Re: Verify a new Runner
Hi Zhixin,
I would love to help you out with this.
One of the best ways to test your runner is to enable the "RunnableOnService" test suite in the core SDK. Here is an example of the configuration for the Flink runner: https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/pom.xml#L49
Another good source of tests is the integration tests, which make sure your runner can run all of our user-facing examples. The configuration for this lives here: https://github.com/apache/incubator-beam/blob/master/examples/java/pom.xml#L133
You said you are almost done, but since you might find some issues when you start running these tests, here are some pointers to other references.
The standard abstract description of the model is still the Dataflow Model paper from a couple years ago: http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43864.pdf.
And finally, we don't have extensive documentation, but the best current reference for what transforms are primitive (and why) for a runner author is probably the Beam Runner API proposal at https://s.apache.org/beam-runner-api. The implementation (moving Beam to the ideal) is still under development, but you may be helped by the sections "Primitive Transforms" at https://s.apache.org/beam-runner-api#heading=h.tt55lhd3k6by and "What does a runner author need to do?" at https://s.apache.org/beam-runner-api#heading=h.cdbhozvw83un.
I hope this helps. And, of course, if there are more details you can share then we can talk about specifics.
Kenn

Re: Verify a new Runner

Posted by Kenneth Knowles <kl...@google.com.INVALID>.
Hi Zhixin,

I would love to help you out with this.

One of the best ways to test your runner is to enable the
"RunnableOnService" test suite in the core SDK. Here is an example of the
configuration for the Flink runner:
https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/pom.xml#L49
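For illustration only, a rough sketch of what such a Surefire configuration might look like in a new runner's pom.xml, loosely modeled on how existing runner modules were set up at the time. The runner name "MyTestRunner" and the execution id are hypothetical placeholders, and the exact options should be checked against the linked Flink pom.xml rather than taken from this sketch.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <executions>
    <execution>
      <!-- Hypothetical execution id; any unique name works. -->
      <id>runnable-on-service-tests</id>
      <phase>integration-test</phase>
      <goals>
        <goal>test</goal>
      </goals>
      <configuration>
        <!-- Run only tests tagged with the RunnableOnService JUnit category. -->
        <groups>org.apache.beam.sdk.testing.RunnableOnService</groups>
        <failIfNoTests>true</failIfNoTests>
        <!-- Pick up the tagged tests that ship in the core SDK artifact. -->
        <dependenciesToScan>
          <dependency>org.apache.beam:beam-sdks-java-core</dependency>
        </dependenciesToScan>
        <systemPropertyVariables>
          <!-- Assumption: "MyTestRunner" stands in for the new runner's test runner class. -->
          <beamTestPipelineOptions>
            [
            "--runner=MyTestRunner"
            ]
          </beamTestPipelineOptions>
        </systemPropertyVariables>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The idea is that the SDK's own tests are executed against the new runner, so any failure points at a semantic difference between that runner and the model the tests encode.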

Another good source of tests is the integration tests, which make sure your
runner can run all of our user-facing examples. The configuration for this
lives here:
https://github.com/apache/incubator-beam/blob/master/examples/java/pom.xml#L133

You said you are almost done, but since you might find some issues
when you start running these tests, here are some pointers to other
references.

The standard abstract description of the model is still the Dataflow Model
paper from a couple years ago:
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43864.pdf

And finally, we don't have extensive documentation, but the best current
reference for what transforms are primitive (and why) for a runner author
is probably the Beam Runner API proposal at
https://s.apache.org/beam-runner-api. The implementation (moving Beam to
the ideal) is still under development, but you may be helped by the
sections "Primitive Transforms" at
https://s.apache.org/beam-runner-api#heading=h.tt55lhd3k6by and "What does
a runner author need to do?" at
https://s.apache.org/beam-runner-api#heading=h.cdbhozvw83un.

I hope this helps. And, of course, if there are more details you can share
then we can talk about specifics.

Kenn

On Mon, Nov 7, 2016 at 7:50 PM 李劲松(之信) <zh...@alibaba-inc.com> wrote:

> Hi there,
>
> I'm working on the Beam integration (= a new runner, mainly for streaming;
> almost done) for an internal system at Alibaba, targeted for production
> use. I'm wondering if you could give me some advice on how to test/verify
> such an implementation. Thank you!
>
> Best,
> Zhixin