Posted to issues@arrow.apache.org by "Philipp Moritz (JIRA)" <ji...@apache.org> on 2017/08/25 18:47:00 UTC
[jira] [Resolved] (ARROW-1410) Plasma object store occasionally pauses for a long time
[ https://issues.apache.org/jira/browse/ARROW-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Philipp Moritz resolved ARROW-1410.
-----------------------------------
Resolution: Fixed
Issue resolved by pull request 992
[https://github.com/apache/arrow/pull/992]
> Plasma object store occasionally pauses for a long time
> -------------------------------------------------------
>
> Key: ARROW-1410
> URL: https://issues.apache.org/jira/browse/ARROW-1410
> Project: Apache Arrow
> Issue Type: Improvement
> Environment: Ubuntu 16.04
> Reporter: Robert Nishihara
> Assignee: Robert Nishihara
>
> The problem can be reproduced as follows. First start a plasma store with
> {code}
> plasma_store -s /tmp/s1 -m 500000000000
> {code}
> Then continuously put in objects using a script like the following.
> {code}
> import pyarrow.plasma as plasma
> import numpy as np
> # Connect to the store socket started above.
> client = plasma.connect('/tmp/s1', '', 0)
> for i in range(20000):
>     print(i)
>     # Create and seal an object of random size under a random 20-byte ID.
>     object_id = plasma.ObjectID(np.random.bytes(20))
>     client.create(object_id, np.random.randint(0, 100000000))
>     client.seal(object_id)
> {code}
> As the loop counter is printed, you will see long pauses. The cause is that we mmap pages with the MAP_POPULATE flag. Although pre-populating pages can speed up subsequent object creations, it isn't worth the long pauses. We may want to find a way to populate the pages in the background instead.
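The trade-off described in the issue can be illustrated outside Plasma with Python's standard mmap module. The sketch below is not the Plasma implementation, nor the change made in the pull request. It assumes Linux and Python 3.10 or later for mmap.MAP_POPULATE (it silently falls back to no flag elsewhere), and the helper names map_region, touch_pages, and populate_in_background are hypothetical.

```python
import mmap
import threading
import time

# Assumption: MAP_POPULATE is Linux-only and exposed by Python 3.10+.
# On other platforms or versions this falls back to 0 (no pre-faulting).
MAP_POPULATE = getattr(mmap, "MAP_POPULATE", 0)


def map_region(size, populate):
    """Create an anonymous private mapping, optionally pre-faulting its pages."""
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    if populate:
        # The kernel faults in every page before mmap returns,
        # so the caller blocks for the whole population step.
        flags |= MAP_POPULATE
    return mmap.mmap(-1, size, flags=flags)


def touch_pages(mm, page_size=mmap.PAGESIZE):
    """Fault in each page by writing one byte to it.

    Only safe on a fresh mapping: it overwrites the first byte of every page.
    """
    for offset in range(0, len(mm), page_size):
        mm[offset] = 0


def populate_in_background(mm):
    """Start a thread that touches every page, and return it to the caller."""
    thread = threading.Thread(target=touch_pages, args=(mm,), daemon=True)
    thread.start()
    return thread


if __name__ == "__main__":
    size = 64 * 1024 * 1024  # 64 MiB, kept small for illustration

    start = time.perf_counter()
    eager = map_region(size, populate=True)
    print(f"MAP_POPULATE mmap took {time.perf_counter() - start:.4f}s")

    start = time.perf_counter()
    lazy = map_region(size, populate=False)
    worker = populate_in_background(lazy)
    print(f"lazy mmap returned after {time.perf_counter() - start:.4f}s")

    # Wait for the background thread before closing the mapping it writes to.
    worker.join()
    eager.close()
    lazy.close()
```

With MAP_POPULATE the cost of faulting in pages is paid inside the mmap call itself, which is where a pause would show up. Without it, the call returns quickly and the pages are faulted in on first touch, here by a background thread rather than by the code path that serves requests.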