Posted to user@flink.apache.org by Eric Fukuda <e....@gmail.com> on 2016/10/28 00:47:11 UTC

Yahoo! Streaming Benchmark with Flink

Hi,

I have two questions on the blog post on Yahoo! Streaming Benchmark with
Flink [1].

The first is about the join operation that associates ad_ids with
campaign_ids. In
flink.benchmark.state.AdvertisingTopologyFlinkStateHighKeyCard, I don't see
this being done. Is there a reason for this?

The second is about the Akka actor. Reading
flink.benchmark.state.QueryableWindowOperator or
flink.benchmark.state.QueryableWindowOperatorEvicting, it looks like the
Akka actor is being prepared but not used in the actual processing
(processElement()). Is this correct? And how do I enable Akka in the job?

[1] http://data-artisans.com/extending-the-yahoo-streaming-benchmark/

Thanks,
Eric

Re: Yahoo! Streaming Benchmark with Flink

Posted by Eric Fukuda <e....@gmail.com>.
Thanks Till, your reply answered my questions perfectly.

Regards,
Eric

On Fri, Oct 28, 2016 at 11:00 AM, Till Rohrmann <tr...@apache.org>
wrote:

> Hi Eric,
>
> Concerning your first question: I think that
> AdvertisingTopologyFlinkStateHighKeyCard models a different scenario, where
> one tries to count the number of ads per campaign for a large number of
> campaigns. In this scenario, the input data already contains the campaign
> id for each ad, so no join is needed. I think this is the job for the
> section "Winning Twitter Hack Week: Eliminating the key-value store
> bottleneck" of the blog post.
>
> Concerning your second question: the response actor is registered at the
> registration service. The registration service exposes the Akka URL of this
> actor under the index of the running task. When you run AkkaStateQuery, the
> registration service is queried to retrieve the Akka URL, and then a query
> state request is sent to the response actor via the QueryActor. That is how
> the actor comes into play.
>
> At the moment the registration service is implemented using ZooKeeper.
> This means that the Akka URL is written to ZooKeeper, from where it can be
> retrieved.
>
> I hope this answers your questions.
>
> Cheers,
> Till

Re: Yahoo! Streaming Benchmark with Flink

Posted by Till Rohrmann <tr...@apache.org>.
Hi Eric,

Concerning your first question: I think that
AdvertisingTopologyFlinkStateHighKeyCard models a different scenario, where
one tries to count the number of ads per campaign for a large number of
campaigns. In this scenario, the input data already contains the campaign
id for each ad, so no join is needed. I think this is the job for the
section "Winning Twitter Hack Week: Eliminating the key-value store
bottleneck" of the blog post.
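
To illustrate the difference, here is a minimal Python sketch (not code from
the benchmark repository) of counting events per campaign when each event
already carries its campaign id, so no ad_id-to-campaign_id lookup is
involved. The event fields, the window size, and the function name are
assumptions made purely for this illustration.

```python
from collections import Counter

WINDOW_MS = 10_000  # assumed tumbling-window size, for illustration only


def count_per_campaign(events):
    """Count events per (campaign_id, window_start) pair.

    Each event is a dict that already contains "campaign_id" and
    "event_time" (milliseconds), so no lookup from ad_id to
    campaign_id is required.
    """
    counts = Counter()
    for event in events:
        # Align the event time to the start of its tumbling window.
        window_start = event["event_time"] - event["event_time"] % WINDOW_MS
        counts[(event["campaign_id"], window_start)] += 1
    return counts


# Hypothetical input: the campaign id is part of every event.
events = [
    {"campaign_id": "c1", "event_time": 1_000},
    {"campaign_id": "c1", "event_time": 9_999},
    {"campaign_id": "c2", "event_time": 10_000},
    {"campaign_id": "c1", "event_time": 12_500},
]
counts = count_per_campaign(events)
```

In the original benchmark topology, by contrast, the events would carry only
an ad id and the campaign id would have to be looked up first.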

Concerning your second question: the response actor is registered at the
registration service. The registration service exposes the Akka URL of this
actor under the index of the running task. When you run AkkaStateQuery, the
registration service is queried to retrieve the Akka URL, and then a query
state request is sent to the response actor via the QueryActor. That is how
the actor comes into play.

At the moment the registration service is implemented using ZooKeeper. This
means that the Akka URL is written to ZooKeeper, from where it can be
retrieved.
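
The flow described above can be sketched roughly as follows. This is a
conceptual Python model only, not the benchmark's code: the class names,
method names, and the actor URL are all hypothetical, and a plain dictionary
stands in for ZooKeeper and for Akka's message delivery.

```python
class InMemoryRegistrationService:
    """Hypothetical stand-in for the ZooKeeper-backed registration service."""

    def __init__(self):
        self._urls = {}  # task index -> actor URL

    def register(self, task_index, actor_url):
        # The operator publishes its response actor's URL under its task index.
        self._urls[task_index] = actor_url

    def lookup(self, task_index):
        # The query client retrieves the URL for the task it wants to ask.
        return self._urls[task_index]


class ResponseActor:
    """Hypothetical actor that answers state queries for one task."""

    def __init__(self, state):
        self._state = state

    def handle_query(self, key):
        return self._state.get(key)


def query_state(registry, actors_by_url, task_index, key):
    """Look up the actor URL for a task, then send it a state query."""
    url = registry.lookup(task_index)
    # A dict lookup replaces real message passing in this sketch.
    return actors_by_url[url].handle_query(key)


# Made-up URL; a real Akka URL would come from the running actor system.
url = "akka.tcp://flink@host-0:6123/user/response-0"
registry = InMemoryRegistrationService()
registry.register(0, url)
actors_by_url = {url: ResponseActor({"c1": 3})}
result = query_state(registry, actors_by_url, task_index=0, key="c1")
```

The point of the sketch is that the actor sits on the query path rather than
on the element-processing path, which is consistent with it not appearing in
processElement().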

I hope this answers your questions.

Cheers,
Till
