Posted to dev@brpc.apache.org by GitBox <gi...@apache.org> on 2022/12/21 11:37:19 UTC

[GitHub] [incubator-brpc] fzhedu opened a new issue, #2052: Is butil::IOBuf required for transferring large amounts of data?

fzhedu opened a new issue, #2052:
URL: https://github.com/apache/incubator-brpc/issues/2052

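   Each chunk is between 100KB and 100MB in size. A chunk is first serialized into a protobuf message, chunkPB.
   Q1: Is it necessary to copy chunkPB into a butil::IOBuf and send it as an attachment, or should the chunkPB structure be sent directly? Which of the two approaches is more efficient?
   (My understanding is that both work and that sending chunkPB directly is more efficient.)
   
   Q2: When calling the RPC, what is the lifetime of chunkPB on the sending side and on the receiving side?
   For an interface like chunk_rpc(chunkPB, closure): on the sending side, can chunkPB's memory be released once the data has been sent? On the receiving side, does chunkPB stay valid until the closure is called?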


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@brpc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@brpc.apache.org
For additional commands, e-mail: dev-help@brpc.apache.org


[GitHub] [incubator-brpc] leaf-potato commented on issue #2052: Is butil::IOBuf required for transferring large amounts of data?

Posted by GitBox <gi...@apache.org>.
leaf-potato commented on issue #2052:
URL: https://github.com/apache/incubator-brpc/issues/2052#issuecomment-1369936547

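   > Q3: Using perf, we see that sys calls in bthread account for about 20% of all sys calls. We do not know whether we are using it incorrectly or whether there are tuning parameters that can control this.
   
   As a first step, you can troubleshoot this yourself with the [Efficiently troubleshooting server stalls (高效率排查server卡顿)](https://github.com/apache/incubator-brpc/blob/master/docs/cn/server_debugging.md) document.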




[GitHub] [incubator-brpc] leaf-potato commented on issue #2052: Is butil::IOBuf required for transferring large amounts of data?

Posted by GitBox <gi...@apache.org>.
leaf-potato commented on issue #2052:
URL: https://github.com/apache/incubator-brpc/issues/2052#issuecomment-1364497672

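   I re-read the documentation: with an asynchronous request, `chunkPB` can be released once `CallMethod` returns, provided that `done` does not need anything from `chunkPB`.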
   ![image](https://user-images.githubusercontent.com/49188857/209429847-0bd187ea-928f-442e-ac07-b4318e6a3423.png)
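
The lifetime rule above can be illustrated with a small standard-library model. This is only a sketch: `ToyChannel`, `ToyRequest` and `ToyResponse` are hypothetical names, not brpc types, and the model assumes that the framework finishes reading the request before `CallMethod` returns, which is what the documentation quoted in this comment implies. The response, by contrast, has to stay alive until `done` runs.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for a request/response pair; not brpc or protobuf types.
struct ToyRequest { std::string payload; };
struct ToyResponse { std::size_t bytes_received = 0; };

// Toy asynchronous channel: CallMethod copies ("serializes") the request into
// a buffer it owns, then completes the call later when RunPending() is invoked.
class ToyChannel {
public:
    void CallMethod(const ToyRequest& request, ToyResponse* response,
                    std::function<void()> done) {
        assert(response != nullptr);
        // The request is read here, before CallMethod returns, so the caller
        // is free to destroy it afterwards.
        std::string wire = request.payload;
        pending_.push_back(
            [wire = std::move(wire), response, done = std::move(done)]() {
                // The response is written at completion time, so it must
                // outlive the call.
                response->bytes_received = wire.size();
                done();
            });
    }

    // Completes every call that has been issued so far.
    void RunPending() {
        std::vector<std::function<void()>> work;
        work.swap(pending_);
        for (auto& fn : work) fn();
    }

private:
    std::vector<std::function<void()>> pending_;
};
```

In this model a caller can let the `ToyRequest` go out of scope right after `CallMethod` returns, as long as the `done` callback does not touch it; the `ToyResponse` has to survive until `RunPending()` has invoked `done`.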
   
   




[GitHub] [incubator-brpc] leaf-potato commented on issue #2052: Is butil::IOBuf required for transferring large amounts of data?

Posted by GitBox <gi...@apache.org>.
leaf-potato commented on issue #2052:
URL: https://github.com/apache/incubator-brpc/issues/2052#issuecomment-1375721168

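   I have not seen any documentation or benchmark data on this so far; you could ask the repository maintainers. Briefly, my personal understanding:
   
   1. Setting the response via proto: the user sets the response => the data is serialized (by the framework) => send
   2. Sending serialized data via attachment: the user sets the response => the data is serialized (by the user) => send
   
   Both approaches need to store the serialized data; the difference is where the serialization happens.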




[GitHub] [brpc] fzhedu commented on issue #2052: Is butil::IOBuf required for transferring large amounts of data?

Posted by GitBox <gi...@apache.org>.
fzhedu commented on issue #2052:
URL: https://github.com/apache/brpc/issues/2052#issuecomment-1378169591

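   @leaf-potato Thank you for your reply.
   I confirmed in the official WeChat group that neither append nor append_user_data needs serialization to put user data into an IOBuf, and that the latter is more efficient because it avoids a copy.
   Thanks again for your patient answers.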
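
The difference between the two calls can be sketched with a toy buffer built on the standard library. `ToyBuf`, `free_and_count` and `g_user_frees` are hypothetical names for illustration; this is not `butil::IOBuf` and says nothing about brpc's internals. The method names only mirror `append` and `append_user_data` to show the ownership idea: one copies the bytes into a block owned by the buffer, the other adopts the caller's memory and frees it through a caller-supplied deleter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical toy buffer used only to contrast the two append styles.
// It is NOT butil::IOBuf and does not reflect brpc's implementation.
class ToyBuf {
public:
    // Copying append: allocate a private block and memcpy the bytes into it.
    void append(const void* data, std::size_t size) {
        assert(data != nullptr && size > 0);
        char* copy = static_cast<char*>(std::malloc(size));
        assert(copy != nullptr);
        std::memcpy(copy, data, size);
        blocks_.push_back(
            {std::shared_ptr<char>(copy, [](char* p) { std::free(p); }), size});
        ++copies_;
    }

    // Zero-copy append: adopt the caller's memory as a block. `deleter` runs
    // once the buffer no longer references that block.
    void append_user_data(void* data, std::size_t size, void (*deleter)(void*)) {
        assert(data != nullptr && size > 0 && deleter != nullptr);
        blocks_.push_back(
            {std::shared_ptr<char>(static_cast<char*>(data), deleter), size});
    }

    // Total number of bytes referenced by the buffer.
    std::size_t size() const {
        std::size_t total = 0;
        for (const Block& b : blocks_) total += b.size;
        return total;
    }

    // Number of memcpy-style copies performed so far.
    int copies() const { return copies_; }

    // Start address of the i-th block, to check whether memory was adopted.
    const char* block_data(std::size_t i) const { return blocks_[i].data.get(); }

private:
    struct Block {
        std::shared_ptr<char> data;
        std::size_t size;
    };
    std::vector<Block> blocks_;
    int copies_ = 0;
};

// Demo-only deleter that frees malloc'ed memory and counts how often it runs.
static int g_user_frees = 0;
static void free_and_count(void* p) {
    std::free(p);
    ++g_user_frees;
}
```

With this model, appending a malloc'ed payload through `append_user_data` leaves `copies()` unchanged and `block_data()` pointing at the caller's own memory, and the deleter runs when the buffer is destroyed. The trade-off is that the caller must not touch or free that memory after handing it over.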




[GitHub] [incubator-brpc] leaf-potato commented on issue #2052: Is butil::IOBuf required for transferring large amounts of data?

Posted by GitBox <gi...@apache.org>.
leaf-potato commented on issue #2052:
URL: https://github.com/apache/incubator-brpc/issues/2052#issuecomment-1373070266

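   > That is, when the data to send is >= 2G, the copy on the sending side cannot be eliminated; all other copies can be, which makes this more efficient.
   
   For copying the serialized data into the attachment, check whether IOBuf's `append_user_data` method solves the problem.
   
   Q3: As a first step, you can troubleshoot with the [Efficiently troubleshooting server stalls (高效率排查server卡顿)](https://github.com/apache/incubator-brpc/blob/master/docs/cn/server_debugging.md) document.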
   




[GitHub] [brpc] fzhedu closed issue #2052: Is butil::IOBuf required for transferring large amounts of data?

Posted by GitBox <gi...@apache.org>.
fzhedu closed issue #2052: Is butil::IOBuf required for transferring large amounts of data?
URL: https://github.com/apache/brpc/issues/2052




[GitHub] [incubator-brpc] leaf-potato commented on issue #2052: Is butil::IOBuf required for transferring large amounts of data?

Posted by GitBox <gi...@apache.org>.
leaf-potato commented on issue #2052:
URL: https://github.com/apache/incubator-brpc/issues/2052#issuecomment-1363987901

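   Q1: Assuming both the client and the server are brpc, the attachment is best seen as a supplement to the proto message. Protobuf officially limits message size: the default limit is 64MB and the hard limit is 2GB. Within the limits, attachment and proto differ in serialization and deserialization; if the attachment carries proto-serialized data, the two should in theory perform about the same.
   
   Q2: In theory, chunkPB has to outlive the RPC request.
   1. How would the sender determine that the request has been fully sent? For a synchronous request, when the _stub function returns, the whole RPC exchange is complete and the response has been received. For an asynchronous request, the _stub function returning only means that the request has been handed to the RPC framework, and the user's callback is invoked when the RPC finishes. brpc probably does not provide an interface that tells the user the request has been sent.
   2. Until the closure is called, the RPC has not finished and chunkPB remains valid.
   
   This is only my personal view; corrections are welcome.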




[GitHub] [incubator-brpc] fzhedu commented on issue #2052: Is butil::IOBuf required for transferring large amounts of data?

Posted by GitBox <gi...@apache.org>.
fzhedu commented on issue #2052:
URL: https://github.com/apache/incubator-brpc/issues/2052#issuecomment-1369506674

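   @leaf-potato Thank you for your reply. Q2 is settled, but I still have questions on Q1 about the performance of IOBuf versus sending chunkPB directly. What we send is a chunkPB list made up of multiple chunkPBs, and the list can be sent directly. With IOBuf, StarRocks currently copies each chunkPB in the list into an IOBuf for sending, and the receiving side has to copy each chunkPB back out of the IOBuf. That is one copy on the sending side and one on the receiving side, a cost that should make IOBuf somewhat slower. Considering that IOBuf has the advantage of not limiting the amount of data sent, we made some optimizations:
   1. On the receiving side, use the cut() function to avoid the copy, deserializing by referencing the data in the IOBuf directly;
   2. On the sending side, when the data to send is < 2G, send the serialized chunkPB list directly; otherwise copy it into an IOBuf.
   
   That is, when the data to send is >= 2G, the copy on the sending side cannot be eliminated; all other copies can be, which makes this more efficient.
   
   There is one more question I would like to ask:
   
   Q3: Using perf, we see that sys calls in bthread account for about 20% of all sys calls. We do not know whether we are using it incorrectly or whether there are tuning parameters that can control this.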
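
The sender-side rule in item 2 comes down to a size check. A minimal sketch follows, assuming "2G" means 2 GiB (2^31 bytes); `SendPath`, `kProtoBodyLimit` and `choose_send_path` are hypothetical names used for illustration and are not taken from StarRocks or brpc.

```cpp
#include <cstdint>

// Which transport the sender uses for a serialized chunkPB list.
enum class SendPath { kProtoBody, kAttachment };

// Assumption: the "2G" threshold from the comment is read as 2 GiB.
constexpr std::uint64_t kProtoBodyLimit = 1ULL << 31;

// Below the limit the serialized list travels as the protobuf message itself;
// at or above it, the data goes into the attachment instead.
constexpr SendPath choose_send_path(std::uint64_t serialized_bytes) {
    return serialized_bytes < kProtoBodyLimit ? SendPath::kProtoBody
                                              : SendPath::kAttachment;
}
```

Under that assumption a 100MB list takes the protobuf path, while a list of exactly 2 GiB falls back to the attachment.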
   
   
   ![image](https://user-images.githubusercontent.com/6490813/210322342-3b8dc18b-8774-47b7-bf6b-09594ddb5b24.png)
   




[GitHub] [incubator-brpc] fzhedu commented on issue #2052: Is butil::IOBuf required for transferring large amounts of data?

Posted by GitBox <gi...@apache.org>.
fzhedu commented on issue #2052:
URL: https://github.com/apache/incubator-brpc/issues/2052#issuecomment-1375702075

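   @leaf-potato Thanks for the pointer. `append_user_data` does avoid copying a large chunk into the attachment. I am still studying Q3.
   I still have a question on Q1, because I do not understand this part of your reply above:
   ```
   Within the limits, attachment and proto differ in serialization and deserialization; if the attachment carries proto-serialized data, the two should in theory perform about the same.
   ```
   Regarding `if the attachment carries proto-serialized data, the two should in theory perform about the same`: is that because `append_user_data` does not copy the data, so the result is about the same as sending the proto-serialized data directly? And if `append` is used to copy the proto-serialized data into the IOBuf, would that be somewhat slower?
   
   Is there any official documentation on this, or any experimental data?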
   




[GitHub] [incubator-brpc] leaf-potato commented on issue #2052: Is butil::IOBuf required for transferring large amounts of data?

Posted by GitBox <gi...@apache.org>.
leaf-potato commented on issue #2052:
URL: https://github.com/apache/incubator-brpc/issues/2052#issuecomment-1369923528

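   > That is, when the data to send is >= 2G, the copy on the sending side cannot be eliminated; all other copies can be, which makes this more efficient.
   
   My understanding is that sending data involves one copy whether you use PB or attachment; only the place where the copy happens differs:
   1. Sending via PB: the user sets the PB => the PB is serialized (by the framework) => send. The copy happens when the framework obtains the serialized PB.
   2. Attachment: the user serializes the PB => attachment => send. The copy happens when the user assigns the serialized PB to the attachment.
   
   > Within the limits, attachment and proto differ in serialization and deserialization; if the attachment carries proto-serialized data, the two should in theory perform about the same.
   
   In summary, what the user should mainly think about is how to reduce copies when populating the PB, for example by using the PB Swap method to exchange data. In the end, @wwbmmm needs to double check this so that we do not reach a wrong conclusion.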
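
The idea of populating a PB by swapping instead of copying can be sketched with plain `std::string`. `ToyChunkPB`, `fill_by_copy` and `fill_by_swap` are hypothetical names; the class only imitates the `set_`/`mutable_` accessor style that protobuf generates for a string or bytes field and is not a real generated message.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a generated protobuf message with one bytes field.
class ToyChunkPB {
public:
    void set_data(const std::string& value) { data_ = value; }  // copies the bytes
    std::string* mutable_data() { return &data_; }
    const std::string& data() const { return data_; }

private:
    std::string data_;
};

// Fills the message by copying: `payload` keeps its contents, so the bytes
// now exist twice in memory.
inline void fill_by_copy(ToyChunkPB* pb, const std::string& payload) {
    assert(pb != nullptr);
    pb->set_data(payload);
}

// Fills the message by swapping: the field takes over the payload's contents
// and `payload` receives whatever the field held before (empty for a fresh
// message). For large strings, common standard libraries implement this as a
// pointer exchange rather than a byte-by-byte copy.
inline void fill_by_swap(ToyChunkPB* pb, std::string* payload) {
    assert(pb != nullptr && payload != nullptr);
    pb->mutable_data()->swap(*payload);
}
```

The swap variant is only appropriate when the caller no longer needs the payload afterwards, since it is left holding the field's previous contents.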

