Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/06/06 12:57:18 UTC

[jira] [Commented] (BEAM-991) DatastoreIO Write should flush early for large batches

    [ https://issues.apache.org/jira/browse/BEAM-991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16038827#comment-16038827 ] 

ASF GitHub Bot commented on BEAM-991:
-------------------------------------

GitHub user cph6 opened a pull request:

    https://github.com/apache/beam/pull/3302

    [BEAM-991] Raise entities limit per RPC to 9MB.

    This is closer to the API limit, while still leaving room for overhead, and brings
    the Java SDK back into line with the Python SDK.
    
    Switch the unit test to use the size of each entity, which is what the
    connector is actually using, rather than the property size (which is slightly
    smaller and would cause the test to fail for some values).
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cph6/beam datastore_request_size_limit_java.2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/3302.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3302
    
----
commit 356597db6e368fd1372d8fcac311b1ea48dbc00f
Author: Colin Phipps <fi...@google.com>
Date:   2017-06-05T12:12:49Z

    Raise entity limit per RPC to 9MB.
    
    This is closer to the API limit, while still leaving room for overhead. Brings
    the Java SDK into line with the Python SDK.
    
    Switch the unit test to use the size of each entity, which is what the
    connector is actually using, rather than the property size (which is slightly
    smaller and would cause the test to fail for some values).

----


> DatastoreIO Write should flush early for large batches
> ------------------------------------------------------
>
>                 Key: BEAM-991
>                 URL: https://issues.apache.org/jira/browse/BEAM-991
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-gcp
>            Reporter: Vikas Kedigehalli
>            Assignee: Vikas Kedigehalli
>
> If entities are large (average size > 20KB), then a single batched write (500 entities) would exceed the Datastore size limit for a single request (10MB); see https://cloud.google.com/datastore/docs/concepts/limits.
> First reported in: http://stackoverflow.com/questions/40156400/why-does-dataflow-erratically-fail-in-datastore-access
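
The arithmetic in the issue (500 entities x 20KB = 10MB) and the 9MB cap from the pull request can be illustrated with a minimal sketch of size-aware batching. This is not the DatastoreIO implementation: the class name SizeAwareBatcher, the constant names, and the reading of "9MB" as 9,000,000 bytes are all assumptions made for illustration, and entities are represented only by their serialized size in bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not Beam's DatastoreIO code): groups entity sizes into bounded batches. */
final class SizeAwareBatcher {
    // Assumed limits, taken from the issue and PR text: 500 entities and 9MB per RPC.
    // Treating 9MB as 9,000,000 bytes is an assumption of this sketch.
    static final int MAX_ENTITIES_PER_RPC = 500;
    static final long MAX_BYTES_PER_RPC = 9_000_000L;

    /** Splits serialized entity sizes (bytes) into batches, flushing early on either limit. */
    static List<List<Integer>> batch(List<Integer> entitySizes) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long currentBytes = 0;
        for (int size : entitySizes) {
            // Flush before adding if this entity would push the batch past a limit.
            boolean wouldOverflow = current.size() >= MAX_ENTITIES_PER_RPC
                || currentBytes + size > MAX_BYTES_PER_RPC;
            if (!current.isEmpty() && wouldOverflow) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            // An entity larger than the byte limit still gets a batch of its own.
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Under these assumptions, 500 entities of 20,000 bytes each no longer go out as one 10,000,000-byte request; the sketch splits them into two requests that each stay at or below the assumed byte cap.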


