Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/06/21 08:17:00 UTC

[jira] [Commented] (BEAM-2488) Elasticsearch IO should read also in replica shards

    [ https://issues.apache.org/jira/browse/BEAM-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16057158#comment-16057158 ] 

ASF GitHub Bot commented on BEAM-2488:
--------------------------------------

GitHub user echauchot opened a pull request:

    https://github.com/apache/beam/pull/3410

    [BEAM-2488] Elasticsearch IO should read also in replica shards

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:
    
     - [X] Make sure the PR title is formatted like:
       `[BEAM-<Jira issue #>] Description of pull request`
     - [X] Make sure tests pass via `mvn clean verify`.
     - [X] Replace `<Jira issue #>` in the title with the actual Jira issue
           number, if there is one.
     - [X] If this contribution is large, please file an Apache
           [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
    
    ---
    R: @jbonofre 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/echauchot/beam BEAM-2488-ELASTICSEARCHIO-SHARDS

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/3410.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3410
    
----
commit 6f6c09f0c8232afd68a9c6c26d48e4ac7bb226ce
Author: Etienne Chauchot <ec...@gmail.com>
Date:   2017-06-21T08:14:08Z

    [BEAM-2488] Elasticsearch IO should read also in replica shards

----


> Elasticsearch IO should read also in replica shards
> ---------------------------------------------------
>
>                 Key: BEAM-2488
>                 URL: https://issues.apache.org/jira/browse/BEAM-2488
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>            Reporter: Etienne Chauchot
>            Assignee: Etienne Chauchot
>
> To avoid duplicated data, ElasticsearchIO reads from primary shards only and filters out replica shards. In reality, even if _shard-preference:shardId is set in the scroll request, ES internally load-balances requests between the primary and replica copies of a shard and ensures that there are no duplicates. Targeting all the shards and letting ES deal with replicas is better in some corner cases, such as failover.
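
As a rough illustration of the shard targeting described above, the sketch below builds the path and query string of an initial scroll search request that is pinned to one shard. It is not taken from the pull request. It assumes that the "_shard-preference" in the description corresponds to the Elasticsearch `preference=_shards:<id>` search parameter; the function name, index name, type name, and keep-alive value are hypothetical.

```python
from urllib.parse import urlencode


def scroll_search_path(index, doc_type, shard_id=None, keep_alive="5m"):
    """Build the path and query string of an initial scroll search request.

    Hypothetical helper for illustration only. When shard_id is given,
    the request is restricted to that shard via preference=_shards:<id>.
    No primary-only restriction is added, so which copy of the shard
    (primary or replica) answers is left to Elasticsearch.
    """
    params = {"scroll": keep_alive}
    if shard_id is not None:
        params["preference"] = f"_shards:{shard_id}"
    # Keep ':' unescaped so the preference value stays readable.
    query = urlencode(params, safe=":")
    return f"/{index}/{doc_type}/_search?{query}"


if __name__ == "__main__":
    # One request per shard, with no filtering on primary vs. replica.
    for shard in range(3):
        print(scroll_search_path("my-index", "my-type", shard_id=shard))
```

Issuing one such request per shard id covers the whole index once, whichever copy of each shard ends up serving the scroll.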



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)