Posted to commits@doris.apache.org by pa...@apache.org on 2023/04/04 09:10:44 UTC

[doris] branch master updated: [doc](developer-guide) add some debug tricks to dev-guide (#18225)

This is an automated email from the ASF dual-hosted git repository.

panxiaolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new d7623028e9 [doc](developer-guide) add some debug tricks to dev-guide (#18225)
d7623028e9 is described below

commit d7623028e9623db5727da83247993a03415ecb63
Author: zclllyybb <zh...@selectdb.com>
AuthorDate: Tue Apr 4 17:10:34 2023 +0800

    [doc](developer-guide) add some debug tricks to dev-guide (#18225)
    
    add method to debug core-dump file in vscode. and some BE debug tricks.
---
 docs/en/community/developer-guide/be-vscode-dev.md | 33 ++++++++++++++++++--
 .../community/developer-guide/be-vscode-dev.md     | 35 ++++++++++++++++++++--
 2 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/docs/en/community/developer-guide/be-vscode-dev.md b/docs/en/community/developer-guide/be-vscode-dev.md
index db02f6df3e..f2e65ac483 100644
--- a/docs/en/community/developer-guide/be-vscode-dev.md
+++ b/docs/en/community/developer-guide/be-vscode-dev.md
@@ -201,10 +201,10 @@ Among them, environment defines several environment variables DORIS_HOME UDF_RUN
 }
 ```
 
-In the configuration **"request": "attach", "processId": PID**, these two configurations are the key points: set the debug mode of gdb to attach and attach the processId of the process, otherwise it will fail. To find the process id, you can enter the following command in the command line:
+In the configuration, **"request": "attach"** and **"processId": PID** are the key settings: the first puts gdb in attach mode and the second names the process to attach to; without both, attaching fails. The command below prints the `pid` of the Doris BE process directly:
 
 ```
-ps -ef | grep palo*
+lsof -i | grep -m 1 doris_be | awk '{print $2}'
 ```
 
 Or write **"processId": "${command:pickProcess}"** to specify the pid when starting attach.
@@ -296,3 +296,32 @@ lldb's attach mode is faster than gdb,and the usage is similar to gdb. we shou
 ```
 
 It should be noted that this method requires the system `glibc` version to be `2.18+`. You can refer to [Get VSCode CodeLLDB plugin work on CentOS 7](https://gist.github.com/JaySon-Huang/63dcc6c011feb5bd6deb1ef0cf1a9b96) to make the plugin work.
+
+## Debugging core dump files
+
+Sometimes we need to debug the core file generated by a crash. This can also be done in vscode: in the corresponding configuration, add
+```json
+    "coreDumpPath": "/PATH/TO/CORE/DUMP/FILE"
+```
+and you're done.
+
+## Common debugging techniques
+
+### Function execution paths
+
+When you are not familiar with the details of BE execution, you can trace function calls and find the call chain with tools such as `perf`. The usage of `perf` is described in [Debug Tool](./debug-tool.md). Run the SQL statement you want to trace on a fairly large table and raise the sampling frequency (e.g., `perf record -F 999`). The results give a rough picture of the critical path of SQL execution in BE.
+
+### Debugging CRTP objects
+
+BE code makes heavy use of CRTP (Curiously Recurring Template Pattern) in its base types to improve runtime efficiency, which prevents the debugger from inspecting objects according to their derived types. In this case we can work around the problem with GDB as follows:
+
+Suppose we need to debug an object ``col`` of type ``IColumn`` and do not know its actual type, then we can:
+
+```powershell
+set print object on # Output the object as a derived type
+p *col.t # Use col.t in this case to get the exact type of col
+p col.t->size() # Use it as the derived type, e.g. for a ColumnString we can call size()
+......
+```
+
+Note: it is the pointer `COW::t`, not the `IColumn` class object, that behaves polymorphically, so in GDB we need to replace every use of `col` with `col.t` to actually get the derived-type object.
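
The note on `col.t` may be easier to follow with a reduced model. The sketch below is not Doris code: `Ptr`, `ColumnHelper`, `size_impl` and `make_demo_column` are hypothetical names, and the only things taken from the text are the names `IColumn` and `ColumnString`, the CRTP base, and a smart pointer whose raw pointer lives in a member called `t`. It shows that the static type seen through the wrapper is `IColumn`, while the object behind `t` carries the derived dynamic type that GDB reports once `set print object on` is active.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <typeinfo>
#include <vector>

// Hypothetical stand-in for a COW-style smart pointer: the raw pointer is
// kept in a member named `t`, which is what a debugger has to dereference.
template <typename T>
struct Ptr {
    T* t = nullptr;
    explicit Ptr(T* p) : t(p) {}
    T* operator->() const { return t; }
    T& operator*() const { return *t; }
};

// Polymorphic base, modelled loosely on the role IColumn plays in the text.
struct IColumn {
    virtual ~IColumn() = default;
    virtual std::size_t size() const = 0;
};

// CRTP helper: the derived class passes itself as the template argument, so
// the base can forward to derived members with a static_cast.
template <typename Derived>
struct ColumnHelper : IColumn {
    const Derived& self() const { return static_cast<const Derived&>(*this); }
    std::size_t size() const override { return self().size_impl(); }
};

// Simplified derived column; the real ColumnString is far more involved.
struct ColumnString : ColumnHelper<ColumnString> {
    std::vector<std::string> data;
    std::size_t size_impl() const { return data.size(); }
};

// Returns a base-typed handle to a derived object. The static type of
// `*col.t` is IColumn; its dynamic type is ColumnString.
inline Ptr<IColumn> make_demo_column() {
    static ColumnString strings;
    strings.data = {"a", "b", "c"};
    return Ptr<IColumn>(&strings);
}
```

If this is compiled with debug information and `make_demo_column()` is called from a small `main`, stopping after the call and running the GDB commands from the section on the resulting `col` should print the object behind `col.t` as the derived type, because the lookup goes through the polymorphic pointer and not the wrapper.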
diff --git a/docs/zh-CN/community/developer-guide/be-vscode-dev.md b/docs/zh-CN/community/developer-guide/be-vscode-dev.md
index 338413dc38..418d629be1 100644
--- a/docs/zh-CN/community/developer-guide/be-vscode-dev.md
+++ b/docs/zh-CN/community/developer-guide/be-vscode-dev.md
@@ -200,10 +200,10 @@ mkdir -p /soft/be/storage
 }
 ```
 
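-In the configuration, **"request": "attach", "processId":PID** are the two key settings: they set gdb's debug mode to attach and give the processId of the process to attach to; otherwise it will fail. To find the process id, enter the following command on the command line:
+In the configuration, **"request": "attach", "processId":PID** are the two key settings: they set gdb's debug mode to attach and give the processId of the process to attach to; otherwise it will fail. The following command extracts the process ID directly: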
 
 ```
-ps -ef | grep palo*
+lsof -i | grep -m 1 doris_be | awk '{print $2}'
 ```
 
 Or write **"processId": "${command:pickProcess}"** to specify the pid when starting attach.
@@ -293,4 +293,33 @@ lldb的attach比gdb更快,使用方式和gdb类似。vscode需要安装的插
     "pid":"${command:pickMyProcess}"
 }
 ```
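-Note that this method requires the system `glibc` version to be `2.18+`. If it is not, refer to [How to make CodeLLDB work on CentOS 7](https://gist.github.com/JaySon-Huang/63dcc6c011feb5bd6deb1ef0cf1a9b96) to install a newer glibc and link it to the plugin.
\ No newline at end of file
+Note that this method requires the system `glibc` version to be `2.18+`. If it is not, refer to [How to make CodeLLDB work on CentOS 7](https://gist.github.com/JaySon-Huang/63dcc6c011feb5bd6deb1ef0cf1a9b96) to install a newer glibc and link it to the plugin.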
+
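+## Debugging core dump files
+
+Sometimes we need to debug the core file generated by a program crash. This can also be done in vscode: in the corresponding configuration item, just add
+```json
+    "coreDumpPath": "/PATH/TO/CORE/DUMP/FILE"
+```
+and you're done.
+
+## Common debugging techniques
+
+### Function execution paths
+
+When you are not familiar with the details of BE execution, you can trace function calls and find the call chain with tools such as `perf`. The usage of `perf` is described in [Debug Tool](./debug-tool.md). Run the SQL statement you want to trace on a fairly large table and raise the sampling frequency (e.g., `perf record -F 999`). The results give a rough picture of the critical path of SQL execution in BE.
+
+### Debugging CRTP objects
+
+To improve runtime efficiency, BE code makes heavy use of CRTP (Curiously Recurring Template Pattern) in its base types, which prevents the debugger from inspecting objects according to their derived types. In this case we can work around the problem with GDB as follows:
+
+Suppose we need to debug an object `col` of type `IColumn` and do not know its actual type; then we can:
+
+```powershell
+set print object on # Output the object as its derived type
+p *col.t # Use col.t here to get the exact type of col
+p col.t->size() # Use it as the derived type, e.g. for a ColumnString we can call size()
+......
+```
+
+Note: it is the pointer `COW::t`, not the `IColumn` class object, that behaves polymorphically, so in GDB we need to replace every use of `col` with `col.t` to actually get the derived-type object.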

