Posted to notifications@teaclave.apache.org by GitBox <gi...@apache.org> on 2020/06/22 12:17:23 UTC

[GitHub] [incubator-teaclave] ly137062117 opened a new issue #368: Do functions support streaming reads and writes on input files?

ly137062117 opened a new issue #368:
URL: https://github.com/apache/incubator-teaclave/issues/368


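   Recently, while running GBDT training in Teaclave, I found that training on a large sample file (about 1.7 GB) consumes a very large amount of memory.
   So I would like to ask: when the Runtime is used to obtain a reader for the input file and a writer for the output file, does reader.lines() load the entire file into memory before operating on it, or does it work in a streaming fashion?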


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@teaclave.apache.org
For additional commands, e-mail: notifications-help@teaclave.apache.org


[GitHub] [incubator-teaclave] mssun commented on issue #368: Do functions support streaming reads and writes on input files?

Posted by GitBox <gi...@apache.org>.
mssun commented on issue #368:
URL: https://github.com/apache/incubator-teaclave/issues/368#issuecomment-647784381


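   `BufReader` implements the `BufRead` trait, which is exactly this kind of "streaming operation".
   
   The memory usage may be caused by something else, for example `samples`:
   
   https://github.com/apache/incubator-teaclave/blob/0316757e2dfe748185b84d1bff5ae04701b46c8f/function/src/gbdt_train.rs#L135
   
   Or it may have some other cause; finding it requires a detailed review/profile of the code.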
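
To illustrate the distinction above, here is a minimal sketch using only the Rust standard library. The function names `summarize_lines` and `collect_rows` are hypothetical and are not taken from the Teaclave code. The first function streams: it holds one line at a time. The second gathers every parsed row into a `Vec`, which is the kind of pattern that keeps the whole data set in memory even though the reader itself is streaming.

```rust
use std::io::{self, BufRead};

/// Streams over the input: only the current line is held in memory.
/// Returns (number of lines, length in bytes of the longest line).
fn summarize_lines<R: BufRead>(reader: R) -> io::Result<(usize, usize)> {
    let mut count = 0;
    let mut longest = 0;
    for line in reader.lines() {
        let line = line?; // each item is an io::Result<String>
        count += 1;
        longest = longest.max(line.len());
    }
    Ok((count, longest))
}

/// Also reads line by line, but keeps every parsed row, so memory use
/// grows with the size of the input (hypothetical comma-separated format).
fn collect_rows<R: BufRead>(reader: R) -> io::Result<Vec<Vec<f32>>> {
    let mut rows = Vec::new();
    for line in reader.lines() {
        let line = line?;
        let row = line
            .split(',')
            .map(|field| field.trim().parse::<f32>())
            .collect::<Result<Vec<f32>, _>>()
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        rows.push(row);
    }
    Ok(rows)
}
```

Both functions accept anything that implements `BufRead`, for example a `BufReader` wrapped around a file or a `std::io::Cursor` over an in-memory string.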




[GitHub] [incubator-teaclave] ly137062117 commented on issue #368: Do functions support streaming reads and writes on input files?

Posted by GitBox <gi...@apache.org>.
ly137062117 commented on issue #368:
URL: https://github.com/apache/incubator-teaclave/issues/368#issuecomment-647857205


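   @mssun In that case, inside the TEE, does wrapping an io::Write in a BufWriter also give streaming writes? And if so, does the TEE encrypt the output file's contents line by line, as each line is written?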




[GitHub] [incubator-teaclave] litongxin1991 commented on issue #368: Do functions support streaming reads and writes on input files?

Posted by GitBox <gi...@apache.org>.
litongxin1991 commented on issue #368:
URL: https://github.com/apache/incubator-teaclave/issues/368#issuecomment-647873489


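   The memory usage is mainly caused by the gbdt-rs algorithm implementation. During training, gbdt-rs needs all of the data for its computation. Different `training_optimization_level` settings lead to different memory access patterns and internal data representations during training, so the memory used differs as well. In particular, `training_optimization_level=0` or `1` uses less memory than `training_optimization_level=2`.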




[GitHub] [incubator-teaclave] mssun closed issue #368: Do functions support streaming reads and writes on input files?

Posted by GitBox <gi...@apache.org>.
mssun closed issue #368:
URL: https://github.com/apache/incubator-teaclave/issues/368


   




[GitHub] [incubator-teaclave] mssun commented on issue #368: Do functions support streaming reads and writes on input files?

Posted by GitBox <gi...@apache.org>.
mssun commented on issue #368:
URL: https://github.com/apache/incubator-teaclave/issues/368#issuecomment-652115258


   I'm closing this issue. Feel free to reopen or create a new one if you have further questions. Thanks.




[GitHub] [incubator-teaclave] mssun commented on issue #368: Do functions support streaming reads and writes on input files?

Posted by GitBox <gi...@apache.org>.
mssun commented on issue #368:
URL: https://github.com/apache/incubator-teaclave/issues/368#issuecomment-647861215


   The secure file system used by the Teaclave execution service is based on `protected_fs` (https://github.com/apache/incubator-teaclave/blob/master/common/protected_fs_rs/src/sgx_tprotected_fs.rs), which provides a POSIX-compatible file I/O interface.
   
   As for encryption, it is not "write line by line, encrypt line by line"; it is done per block, with an LRU cache.
   
   If you want to know more, you can refer to the protected fs code: https://github.com/apache/incubator-teaclave/tree/master/common/protected_fs_rs/protected_fs_c/sgx_tprotected_fs
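
As a rough illustration of why line-by-line writes in a function do not imply line-by-line work in the layer underneath, here is a minimal sketch using only the Rust standard library. It does not use or model `protected_fs` and performs no encryption. `ChunkRecorder` and `write_lines_buffered` are hypothetical names, and the buffer capacity passed in is an arbitrary illustrative value, not a statement about the block size `protected_fs` uses. The sketch only shows that a `BufWriter` hands data to the underlying writer in larger chunks than the lines the caller writes.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that records the size of every chunk it receives.
/// It stands in for whatever writer sits underneath the buffer.
struct ChunkRecorder {
    chunk_sizes: Vec<usize>,
}

impl Write for ChunkRecorder {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.chunk_sizes.push(buf.len());
        Ok(buf.len()) // accept the whole chunk
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes `line_count` short lines through a `BufWriter` with the given
/// capacity and returns the chunk sizes that reached the underlying sink.
fn write_lines_buffered(line_count: usize, capacity: usize) -> io::Result<Vec<usize>> {
    let sink = ChunkRecorder { chunk_sizes: Vec::new() };
    let mut writer = BufWriter::with_capacity(capacity, sink);
    for i in 0..line_count {
        writeln!(writer, "row {}", i)?; // the caller writes one line at a time
    }
    // into_inner flushes whatever is still buffered before returning the sink.
    let sink = writer.into_inner().map_err(io::Error::from)?;
    Ok(sink.chunk_sizes)
}
```

Calling `write_lines_buffered(1000, 4096)` returns far fewer chunks than the 1000 lines written, and no chunk is larger than the buffer capacity.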

