Posted to dev@reef.apache.org by "Dhruv Mahajan (JIRA)" <ji...@apache.org> on 2015/02/27 20:15:04 UTC
[jira] [Commented] (REEF-179) Solve the large memory footprint problem due to multiple serializations during call to an operator

    [ https://issues.apache.org/jira/browse/REEF-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340641#comment-14340641 ] 

Dhruv Mahajan commented on REEF-179:
------------------------------------

Posting an email conversation I had with Beysim about this.



Note that for some of the basic and most frequently used structures in ML, such as double, float, and integer arrays, we are going to write default functions and can easily determine their sizes. Hence, one could even follow a route where we do optimized operations for the provided codecs and communication structures, and fall back to the defaults for everything else.
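
As a rough illustration of why the sizes of these default structures are easy to determine, here is a minimal sketch in Java. The class and method names (PrimitiveArraySizes, sizeInBytes, encodeWithLengthHeader) are hypothetical and not part of REEF, and the 4-byte length header is an assumed layout chosen only for the example.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of REEF: sizes of primitive arrays are
// known from the element count alone, before any serialization happens.
final class PrimitiveArraySizes {

  static int sizeInBytes(double[] a) { return a.length * Double.BYTES; }

  static int sizeInBytes(float[] a) { return a.length * Float.BYTES; }

  static int sizeInBytes(int[] a) { return a.length * Integer.BYTES; }

  // Encodes a double[] behind an assumed 4-byte length header,
  // allocating header and payload together in a single buffer.
  static byte[] encodeWithLengthHeader(double[] a) {
    int payload = sizeInBytes(a);
    ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + payload);
    buf.putInt(payload);          // header: payload length in bytes
    buf.asDoubleBuffer().put(a);  // view starts right after the header
    return buf.array();
  }
}
```

Because the size is a simple function of the array length, no intermediate serialized copy is needed just to learn how much memory to allocate.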

Dhruv

From: Dhruv Mahajan 
Sent: Thursday, February 26, 2015 4:00 PM
To: Beysim Sezgin; Julia Wang (QIUHE)
Cc: Markus Weimer
Subject: RE: Pre determining the size of object instance in bytes

Basically, at runtime, before I apply the Codec to the object on which I want to apply the MPI operator, I would like to know the number of bytes that object will need after serialization. Knowing this beforehand would save the extra memory footprint: when I later add headers such as the total message length to the message, I could allocate all the required memory in one go instead of in two steps, as is done now.

However, this seems difficult to do: when the user gives us a customized codec, I do not think we can predetermine the number of bytes it generates. We might have to trade off computation against memory here, with the user supplying an additional function that just computes the number of bytes first. However, this is extra effort for the user.
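
To make the trade-off concrete, here is a minimal sketch in Java of a codec that also reports its encoded size, so that the header and payload can share one allocation. Everything in it is an assumption made for illustration: the SizedCodec interface, the SparseVector type, the frame method, and the 4-byte length header are hypothetical and do not describe the existing REEF codec API.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: none of these types exist in REEF.
final class SizedCodecSketch {

  // Assumed contract: the user supplies the size function in addition
  // to the encoder, and the encoder writes into a caller-owned buffer.
  interface SizedCodec<T> {
    int encodedSize(T value);
    void encodeInto(T value, ByteBuffer out);
  }

  // Illustrative user-defined type.
  static final class SparseVector {
    final int[] indices;
    final double[] values;

    SparseVector(int[] indices, double[] values) {
      if (indices.length != values.length) {
        throw new IllegalArgumentException("length mismatch");
      }
      this.indices = indices;
      this.values = values;
    }
  }

  static final class SparseVectorCodec implements SizedCodec<SparseVector> {
    @Override
    public int encodedSize(SparseVector v) {
      // entry count, then one (int index, double value) pair per entry
      return Integer.BYTES + v.indices.length * (Integer.BYTES + Double.BYTES);
    }

    @Override
    public void encodeInto(SparseVector v, ByteBuffer out) {
      out.putInt(v.indices.length);
      for (int i = 0; i < v.indices.length; i++) {
        out.putInt(v.indices[i]);
        out.putDouble(v.values[i]);
      }
    }
  }

  // Allocates header and payload in one go and encodes exactly once.
  static <T> byte[] frame(T value, SizedCodec<T> codec) {
    int payloadSize = codec.encodedSize(value);
    ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + payloadSize);
    buf.putInt(payloadSize); // header: assumed 4-byte payload length
    codec.encodeInto(value, buf);
    if (buf.hasRemaining()) {
      throw new IllegalStateException("codec wrote fewer bytes than encodedSize reported");
    }
    return buf.array();
  }
}
```

The cost is visible in the interface: the user must write encodedSize and keep it consistent with encodeInto, which is the extra effort mentioned above.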

Dhruv

From: Beysim Sezgin 
Sent: Thursday, February 26, 2015 9:56 AM
To: Dhruv Mahajan; Julia Wang (QIUHE)
Cc: Markus Weimer
Subject: RE: Pre determining the size of object instance in bytes

Hi Dhruv,
When do you need to know the size?
If at runtime, see https://msdn.microsoft.com/en-us/library/y3ybkfb3(v=vs.110).aspx
If at compile time, you need to annotate the type with https://msdn.microsoft.com/en-us/library/vstudio/system.runtime.interopservices.structlayoutattribute(v=vs.100).aspx
Regards.

From: Dhruv Mahajan 
Sent: Wednesday, February 25, 2015 8:10 PM
To: Beysim Sezgin; Julia Wang (QIUHE)
Cc: Markus Weimer
Subject: Pre determining the size of object instance in bytes

Hi

So I was looking at the larger memory footprint issue caused by the 3 serialization steps that occur before actually sending a message.

Is there a way we can pre-determine the size in bytes of a class instance? If yes, then the serialization, and hence the memory footprint, can be reduced to 1 step; otherwise we might have to live with 2.

Dhruv


> Solve the large memory footprint problem due to multiple serializations during call to an operator
> --------------------------------------------------------------------------------------------------
>
>                 Key: REEF-179
>                 URL: https://issues.apache.org/jira/browse/REEF-179
>             Project: REEF
>          Issue Type: Improvement
>          Components: REEF-IO, REEF.NET, Wake
>            Reporter: Dhruv Mahajan
>            Assignee: Shravan Matthur Narayanamurthy
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Right now, during each communication call, there are three serialization steps in which the actual data/message to pass is encoded again and again, leading to a larger memory footprint. We would like to optimize this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)