Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2017/07/28 01:12:00 UTC

[jira] [Comment Edited] (ARROW-1282) Large memory reallocation by Arrow causes hang in jemalloc

    [ https://issues.apache.org/jira/browse/ARROW-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104263#comment-16104263 ] 

Wes McKinney edited comment on ARROW-1282 at 7/28/17 1:11 AM:
--------------------------------------------------------------

Oh, that's a bug that I introduced recently, and it would have been caught if we were benchmarking more rigorously. We should be doubling the buffer size when we run out of capacity in {{BufferBuilder}} instead of only expanding to fit the next-appended bit. I think this flew under the radar when I changed {{BinaryBuilder}} from using a {{UInt8Builder}} under the hood (which does array doubling to grow the arrays).

I opened https://issues.apache.org/jira/browse/ARROW-1290. I am betting that this will reduce the number of calls to {{buffer->Resize}} in {{BufferBuilder}} and maybe make this problem go away for you.

Seems we may want to do Arrow 0.6.0 very soon, including this fix, as soon as we can get the Plasma code ready to ship.


> Large memory reallocation by Arrow causes hang in jemalloc
> ----------------------------------------------------------
>
>                 Key: ARROW-1282
>                 URL: https://issues.apache.org/jira/browse/ARROW-1282
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Jeff Knupp
>             Fix For: 0.6.0
>
>
> When reallocating a large amount of memory, Arrow either triggers a bug in jemalloc or has a bug of its own in the memory manager. Many different applications report the same issue, but the jemalloc issue description does not make clear whether the fault lies in jemalloc itself or in other factors, such as using multiple memory allocation libraries in the same process or multithreaded access.
> Link to stack trace is here: https://gist.github.com/jeffknupp/73879feacf9c560afd4f1a20213dc6ef
> Link to issue in jemalloc GitHub is here: https://github.com/jemalloc/jemalloc/issues/802
> Originally observed in redis, discussed with jemalloc maintainer here: https://github.com/antirez/redis/issues/3799
> *This is entirely reproducible on Ubuntu 16.04 xenial, which uses jemalloc version 3.6.0 according to `apt` metadata.*


