Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/08/24 13:11:47 UTC

[GitHub] [beam] egalpin commented on issue #22840: [Feature Request]: ElasticSearchIO.Write to allow adding context to input json

egalpin commented on issue #22840:
URL: https://github.com/apache/beam/issues/22840#issuecomment-1225707383

   @sheepdreamofandroids One option that exists today and might work for you is to use the two building blocks of `ElasticsearchIO#Write` separately: `DocToBulk`[1] and `BulkIO`[2]. `DocToBulk` takes JSON-serialized inputs and converts them to a representation that the ES Bulk API can work with. `BulkIO` deals strictly with batching and sending data to an ES cluster.
   
   In your use case, it sounds like you have a single input PCollection that needs to fan out so that its elements can be processed in multiple ways for inclusion in different ES indices. You could fan out to several `DocToBulk` transforms, each processing the input as needed, then flatten their outputs and use a single `BulkIO` operation. This would allow for larger Bulk API payloads (larger batches), because the outputs of all `DocToBulk` transforms can be combined into a single `BulkIO` write to ES (depending on buffering time, of course).
   
   ```
                                                                         ┌──────────────────────┐
                                                                         │                      │
                                                                         │ Input PCollection    │
                     ┌─────────────────────────────┬─────────────────────┴──────────────────────┴─────────────────────────────┐
                     │                             │                                                                          │
                     │                             │                                                                          │
                     │                             │                                                                          │
                     │                             │                                                                          │
                     │                             │                                                                          │
          ┌──────────▼──────────┐         ┌────────▼──────┐                                                          ┌────────▼────────┐
          │  DocToBulk1         │         │  DocToBulk2   │                                                          │ DocToBulk_n     │
          └────────┬────────────┘         └───────────────┴───────────────────┐                                      └────────┬────────┘
                   │                                                          │                                               │
                   │                                                          │                                               │
                   │                                                          │                                               │
                   │                                                          │                                               │
                   │                                                          │                                               │
                   │                                                          │                                               │
                   │                                            ┌─────────────▼─────────────┐                                 │
                   └───────────────────────────────────────────►│    Flatten                ◄─────────────────────────────────┘
                                                                │                           │
                                                                └─────────────┬─────────────┘
                                                                              │
                                                                              │
                                                                              │
                                                                ┌─────────────▼─────────────┐
                                                                │       BulkIO              │
                                                                └───────────────────────────┘
   
   
   
   ```
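
   To make the shape of that data flow concrete, here is a rough plain-Java model of it. It does not use the Beam API: `toBulkEntry` and `toBulkPayloads` are hypothetical stand-ins for what a `DocToBulk` branch and the `Flatten` + `BulkIO` step do conceptually, and the index names and batch size are made up for illustration. It also assumes index names need no JSON escaping and lets ES auto-generate document IDs.

   ```java
   import java.util.ArrayList;
   import java.util.List;

   public class BulkFanOutSketch {

     // Stand-in for one DocToBulk branch (hypothetical helper, not Beam API):
     // wraps a JSON document in a Bulk API "index" action for the given index.
     // Each entry is an action line followed by the document source line.
     public static String toBulkEntry(String index, String jsonDoc) {
       return "{\"index\":{\"_index\":\"" + index + "\"}}\n" + jsonDoc + "\n";
     }

     // Stand-in for Flatten + BulkIO (hypothetical helper, not Beam API):
     // merges the entries of all branches, then groups them into payloads
     // of at most maxBatchSize entries each.
     public static List<String> toBulkPayloads(List<List<String>> branches, int maxBatchSize) {
       List<String> flattened = new ArrayList<>();
       for (List<String> branch : branches) {
         flattened.addAll(branch);
       }
       List<String> payloads = new ArrayList<>();
       for (int i = 0; i < flattened.size(); i += maxBatchSize) {
         int end = Math.min(i + maxBatchSize, flattened.size());
         payloads.add(String.join("", flattened.subList(i, end)));
       }
       return payloads;
     }

     public static void main(String[] args) {
       List<String> input = List.of("{\"id\":1}", "{\"id\":2}");

       // Fan out: every input document is routed to two indices.
       List<String> toIndexA = new ArrayList<>();
       List<String> toIndexB = new ArrayList<>();
       for (String doc : input) {
         toIndexA.add(toBulkEntry("index-a", doc));
         toIndexB.add(toBulkEntry("index-b", doc));
       }

       // Flatten both branches and batch them together: one request body
       // carries the entries for both indices.
       List<String> payloads = toBulkPayloads(List.of(toIndexA, toIndexB), 10);
       System.out.print(payloads.get(0));
     }
   }
   ```

   The point of the sketch is the last step: because the branches are merged before batching, entries destined for different indices share the same Bulk request instead of each branch sending its own smaller batches.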
   
   [1] https://beam.apache.org/releases/javadoc/2.40.0/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.DocToBulk.html
   [2] https://beam.apache.org/releases/javadoc/2.40.0/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.BulkIO.html

