Posted to dev@beam.apache.org by Arif Alili <a....@propellor.eu> on 2020/12/28 13:14:01 UTC

ElasticsearchIO - Read multiple PubSub topics and write to different Indices

Hi all,

I am writing to Elasticsearch using Beam (on Google Dataflow). The pipeline
ingests data from a PubSub subscription and writes it to an Elasticsearch
index (using this Dataflow Template
<https://github.com/GoogleCloudPlatform/DataflowTemplates/tree/master/v2/pubsub-to-elasticsearch>).
This works fine.

What I am trying to do now is listen to multiple PubSub topics and write to
multiple Elasticsearch indices from one Dataflow job. I want to avoid
creating multiple Dataflow jobs because the number of
PubSub-to-Elasticsearch topics can grow quickly in the near future.

I am using Beam's ElasticsearchIO to write data from a single PubSub topic
to Elasticsearch. What I need is to change ElasticsearchIO to write to
multiple indices.

Is anyone familiar with a similar architecture? What's the best approach for
this scenario?

Best,
-- 
*Arif Alili*

Pilotenstraat 43 bg
1059 CH Amsterdam
020 - 6 71 71 71 <+31+20+6+71+71+71>
a.alili@propellor.eu <[n...@propellor.eu>
www.propellor.eu


<https://www.facebook.com/PropellorEU/>
<https://www.linkedin.com/company/10870471>
<https://twitter.com/PropellorEU/>

Re: ElasticsearchIO - Read multiple PubSub topics and write to different Indices

Posted by Reuven Lax <re...@google.com>.
In that case, why can't you put them all in the same pipeline?

On Mon, Jan 4, 2021 at 12:12 AM Arif Alili <a....@propellor.eu> wrote:

> Yes, the set is known.
>
> Each PubSub topic should be indexed in a different index in Elasticsearch.
>
> For example, I have:
> PubSub topics: topicA, topicB, topicC - that should be indexed as: indexA,
> indexB, indexC in Elasticsearch.
>
> On Sun, Jan 3, 2021 at 4:44 AM Reuven Lax <re...@google.com> wrote:
>
>> Do you know the set of PubSub topics when you launch your pipeline?

Re: ElasticsearchIO - Read multiple PubSub topics and write to different Indices

Posted by Arif Alili <a....@propellor.eu>.
Yes, the set is known.

Each PubSub topic should be indexed in a different index in Elasticsearch.

For example, I have:
PubSub topics: topicA, topicB, topicC - that should be indexed as: indexA,
indexB, indexC in Elasticsearch.
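
The topic-to-index mapping in this example can be sketched in plain Java as
below. This is only an illustration under assumptions that are not in the
thread: the class TopicIndexRouting, its methods indexFor and routingTable,
and the naming convention (swap the "topic" prefix for "index" and lowercase
the result, since Elasticsearch index names must be lowercase) are all
hypothetical. It does not use Beam or the Dataflow template.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of Beam or ElasticsearchIO.
final class TopicIndexRouting {
    private static final String TOPIC_PREFIX = "topic";
    private static final String INDEX_PREFIX = "index";

    private TopicIndexRouting() {}

    // Assumed convention: replace the "topic" prefix with "index" and
    // lowercase the result, because Elasticsearch rejects uppercase
    // characters in index names.
    static String indexFor(String topic) {
        if (topic == null
                || !topic.startsWith(TOPIC_PREFIX)
                || topic.length() == TOPIC_PREFIX.length()) {
            throw new IllegalArgumentException("Unexpected topic name: " + topic);
        }
        String suffix = topic.substring(TOPIC_PREFIX.length());
        return (INDEX_PREFIX + suffix).toLowerCase(Locale.ROOT);
    }

    // Builds the routing table once for a set of topics known up front.
    // LinkedHashMap keeps the topics in the order they were listed.
    static Map<String, String> routingTable(List<String> topics) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String topic : topics) {
            table.put(topic, indexFor(topic));
        }
        return table;
    }

    public static void main(String[] args) {
        routingTable(List.of("topicA", "topicB", "topicC"))
            .forEach((topic, index) -> System.out.println(topic + " -> " + index));
    }
}
```

Because the set of topics is known at launch, a table like this could be
built once while the pipeline is being constructed, with one entry per
topic and target index.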

On Sun, Jan 3, 2021 at 4:44 AM Reuven Lax <re...@google.com> wrote:

> Do you know the set of PubSub topics when you launch your pipeline?


Re: ElasticsearchIO - Read multiple PubSub topics and write to different Indices

Posted by Reuven Lax <re...@google.com>.
Do you know the set of PubSub topics when you launch your pipeline?
