Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2019/06/14 16:23:00 UTC
[jira] [Closed] (MADLIB-1337) DL: Better warning and default for
gpu memory fraction when no of gpus < no of segments
[ https://issues.apache.org/jira/browse/MADLIB-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Frank McQuillan closed MADLIB-1337.
-----------------------------------
Resolution: Fixed
https://github.com/apache/madlib/pull/412
> DL: Better warning and default for gpu memory fraction when no of gpus < no of segments
> ---------------------------------------------------------------------------------------
>
> Key: MADLIB-1337
> URL: https://issues.apache.org/jira/browse/MADLIB-1337
> Project: Apache MADlib
> Issue Type: Improvement
> Components: Deep Learning
> Reporter: Nikhil
> Priority: Minor
> Fix For: v1.16
>
>
> We support the use case where the number of GPUs is less than the number of segments; however, we noticed that this sometimes causes GPDB failures such as:
> {code:java}
> could not connect to segment: initialization of segworker group failed
> {code}
> # We should give the user a meaningful warning that this configuration may or may not work, and include a recommendation.
> # We should also come up with a better heuristic for the GPU memory fraction value. Currently we default to using 90% of the available memory, distributed evenly among the segments.
> Possible recommendations:
> 1. Use as many GPUs as segments (this may not be practical).
> 2. A smaller buffer size may help: use the DL minibatch preprocessor to pack fewer images per buffer (we need to test this before we recommend it).
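
The default described above (90% of the available GPU memory, split evenly among the segments) together with the proposed warning could be sketched roughly as follows. This is only an illustration, not the code from the linked pull request: the function name, its parameters, the warning text, and the choice to size the share by the largest number of segments that could land on one GPU are all assumptions.

```python
import warnings


def gpu_memory_fraction(num_gpus, num_segments, usable_fraction=0.9):
    """Hypothetical helper: share of one GPU's memory given to each segment.

    Assumes segments on a host are spread as evenly as possible over the
    GPUs on that host, and sizes the share for the most crowded GPU.
    """
    if num_gpus <= 0 or num_segments <= 0:
        raise ValueError("num_gpus and num_segments must be positive")

    # Ceiling division: the most segments that end up sharing a single GPU.
    segments_per_gpu = (num_segments + num_gpus - 1) // num_gpus

    if segments_per_gpu > 1:
        # Illustrative wording only; the real warning text may differ.
        warnings.warn(
            "Fewer GPUs ({}) than segments ({}): segments will share GPUs "
            "and training may fail. Consider using as many GPUs as "
            "segments.".format(num_gpus, num_segments)
        )

    # Split the usable part of the GPU memory evenly among its segments.
    return usable_fraction / segments_per_gpu
```

Under these assumptions, 2 GPUs serving 4 segments would give each segment a fraction of 0.45 and emit the warning, while 4 GPUs serving 4 segments would give each segment 0.9 with no warning.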