Posted to issues@beam.apache.org by "Niel Markwick (Jira)" <ji...@apache.org> on 2019/11/27 15:39:00 UTC

[jira] [Assigned] (BEAM-8825) OOM when writing large numbers of 'narrow' rows

     [ https://issues.apache.org/jira/browse/BEAM-8825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niel Markwick reassigned BEAM-8825:
-----------------------------------

    Assignee: Niel Markwick

> OOM when writing large numbers of 'narrow' rows
> -----------------------------------------------
>
>                 Key: BEAM-8825
>                 URL: https://issues.apache.org/jira/browse/BEAM-8825
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.9.0, 2.10.0, 2.11.0, 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0
>            Reporter: Niel Markwick
>            Assignee: Niel Markwick
>            Priority: Major
>             Fix For: 2.18.0
>
>
> SpannerIO can run out of memory (OOM) when writing large numbers of 'narrow' rows.
>
> SpannerIO puts input mutation elements into batches for efficient writing.
> These batches are limited by the number of cells mutated and the size of the data written (5000 cells, 1MB data). SpannerIO groups enough mutations to build 1000 of these batches (5M cells, 1GB data), then sorts and batches them.
> When the number of cells and the size of the data per row are very small (<5 cells, <100 bytes), the memory overhead of storing millions of mutations for batching is significant and can lead to OOMs.
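
To make the arithmetic in the description concrete, here is a rough, self-contained sketch of how many mutations end up buffered before sorting. The limits (5000 cells, 1MB, 1000 batches) come from the description above. Everything else is an assumption for illustration only: the class and method names are hypothetical and are not part of SpannerIO, "1MB" is taken as 10^6 bytes, and the per-mutation heap overhead is a caller-supplied guess rather than a measured value.

```java
// Hypothetical back-of-the-envelope estimate; not SpannerIO code.
public class BatchingEstimate {
    // Limits quoted in the issue description.
    static final long MAX_CELLS_PER_BATCH = 5_000L;
    static final long MAX_BYTES_PER_BATCH = 1_000_000L; // "1MB", assumed to mean 10^6 bytes
    static final long BATCHES_PER_GROUP = 1_000L;

    /** Mutations buffered before sorting: whichever limit (cells or bytes) is reached first. */
    static long bufferedMutations(long cellsPerRow, long bytesPerRow) {
        long byCells = MAX_CELLS_PER_BATCH * BATCHES_PER_GROUP / cellsPerRow;
        long byBytes = MAX_BYTES_PER_BATCH * BATCHES_PER_GROUP / bytesPerRow;
        return Math.min(byCells, byBytes);
    }

    /**
     * Rough heap estimate: payload plus an assumed fixed per-mutation object overhead.
     * The overhead value is a guess supplied by the caller, not a measured number.
     */
    static long estimatedHeapBytes(long cellsPerRow, long bytesPerRow, long overheadPerMutation) {
        return bufferedMutations(cellsPerRow, bytesPerRow) * (bytesPerRow + overheadPerMutation);
    }

    public static void main(String[] args) {
        // Narrow rows (5 cells, 100 bytes): the cell limit is reached long before the byte limit.
        System.out.println("narrow rows buffered: " + bufferedMutations(5, 100));
        // Wide rows (50 cells, 50,000 bytes): the byte limit is reached first.
        System.out.println("wide rows buffered:   " + bufferedMutations(50, 50_000));
        // Heap estimate for narrow rows with an assumed 400 bytes of overhead per mutation.
        System.out.println("narrow heap estimate: " + estimatedHeapBytes(5, 100, 400));
    }
}
```

Under these assumptions, narrow rows never approach the 1GB data limit, so the number of buffered objects is governed by the cell limit alone. Any fixed per-object overhead is therefore multiplied by a million or more mutations, and it can easily exceed the size of the payload itself.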



--
This message was sent by Atlassian Jira
(v8.3.4#803005)