Posted to dev@beam.apache.org by Kai Jiang <ji...@gmail.com> on 2018/05/03 11:43:15 UTC

Google Summer of Code Project Intro

Hi Beam Dev,

I am Kai. GSoC announced the selected projects last week. During the community
bonding period, I want to share some basics about this year's project with
Apache Beam.

Project abstract:
https://summerofcode.withgoogle.com/projects/#6460770829729792
Issue Tracker: BEAM-3783 <https://issues.apache.org/jira/browse/BEAM-3783>

This project will be mentored by Kenneth Knowles. Many thanks to Kenn for his
mentorship over the next three months. Any ideas and comments from you are
also welcome!

The project will mainly focus on implementing a TPC-DS benchmark on Beam
SQL. We've seen many such benchmarks run on Spark, Hive, Pig, etc., so it
will be interesting to see what happens when one is built on Beam SQL.
Presumably, the benchmark will be run on different runners (such as
Spark or Flink). Based on the results, a performance report will
eventually be generated.

The proposal doc is here (more details will be added):
https://docs.google.com/document/d/15oYd_jFVbkiSPGT8-XnSh7Q-R3CtZwHaizyQfmrShfo/edit?usp=sharing

Once the coding period starts on May 14, I will keep posting updates on the
status and progress of this project.

Best,
Kai

Re: Google Summer of Code Project Intro

Posted by Andrew Pilloud <ap...@google.com>.
Hi Kai,

Glad to hear someone is putting more work into benchmarking Beam SQL! It
would be really cool if we had some of these running as nightly performance
test jobs so we would know when there is a performance regression. This
might be out of scope for your project, but keep it in mind.

I am working on SQL and ported some of the Nexmark benchmarks there. Feel
free to email me questions. I can also poke Kenn for you whenever he's not
responsive.

Andrew
