Posted to jira@arrow.apache.org by "Weston Pace (Jira)" <ji...@apache.org> on 2021/01/22 16:53:00 UTC

[jira] [Commented] (ARROW-11348) [C++] Add pretty printing support for gdb

    [ https://issues.apache.org/jira/browse/ARROW-11348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270269#comment-17270269 ] 

Weston Pace commented on ARROW-11348:
-------------------------------------

I've made a first pass at this which improves things considerably.  I will keep improving these scripts and adding new information and features as I debug, and hopefully they will be robust enough to merge at some point.  If anyone is interested in helping develop them, they are located here: [https://github.com/westonpace/arrow/tree/feature/gdb-pretty-printers]

To use the pretty printers you will need something like this in your .gdbinit:

 
{code:java}
python
from pathlib import Path

def load_file(gdb_dir, filename):
  # Source one helper script from the pretty-printer directory.
  fullpath = str(gdb_dir / filename)
  print(f'Activating pretty printer {fullpath}')
  gdb.execute(f'source {fullpath}')

# Walk up from the current directory until a dev/gdb directory is found.
dir_ = Path('.').absolute()
while True:
  gdb_dir = dir_ / 'dev' / 'gdb'
  if gdb_dir.exists():
    print(f'Activating pretty printers found at {gdb_dir}')
    load_file(gdb_dir, 'find_stl.py')
    load_file(gdb_dir, 'pretty_printers.py')
    load_file(gdb_dir, 'commands.py')
    break
  if dir_ == Path('/'):
    # Reached the filesystem root without finding dev/gdb.
    print('Could not locate pretty printers')
    break
  dir_ = dir_.parent
end

{code}
This script will find the printers as long as you are in the arrow directory or a subdirectory when you run gdb.
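
For anyone who has not written a gdb pretty printer before, here is a minimal sketch of the general shape such a script takes. It is not the code from the branch above: the names `BufferPrinterSketch` and `lookup_sketch` are made up for illustration, and the only Arrow details it relies on are the `size_` and `capacity_` members that appear in the raw gdb output further down.

```python
try:
    import gdb  # only importable inside a gdb session
except ImportError:
    gdb = None


class BufferPrinterSketch:
    """Hypothetical printer: summarizes an arrow::Buffer in one line."""

    def __init__(self, val):
        self.val = val

    def to_string(self):
        # size_ and capacity_ are the member names visible in the raw output.
        size = int(self.val['size_'])
        capacity = int(self.val['capacity_'])
        return f'Buffer (size={size} capacity={capacity})'


def lookup_sketch(val):
    """Return a printer for values whose static type is arrow::Buffer."""
    if val.type.strip_typedefs().tag == 'arrow::Buffer':
        return BufferPrinterSketch(val)
    return None  # let gdb fall back to its default printing


if gdb is not None:
    # gdb calls every function in this list for each value it prints.
    gdb.pretty_printers.append(lookup_sketch)
```

The lookup function only matches the exact static type, so a real script would likely need to handle subclasses and references as well.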

 

 

There is also a utility that tries to find the STL pretty printers.  It locates them through conda, so you will need to be in a conda environment with the gxx_linux-64 package installed for it to find them.
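
One way such a lookup could work is sketched below. This is an assumption about the approach, not the contents of find_stl.py: it assumes the active environment is exposed through the `CONDA_PREFIX` variable and that the libstdc++ printers are installed somewhere under that prefix as `libstdcxx/v6/printers.py`. The function name `find_stl_printers_sketch` is hypothetical.

```python
import os
from pathlib import Path


def find_stl_printers_sketch(prefix):
    """Return the directory to put on sys.path, or None if nothing is found."""
    # libstdc++ ships its printers as <dir>/libstdcxx/v6/printers.py.
    for candidate in sorted(Path(prefix).rglob('printers.py')):
        v6_dir = candidate.parent
        if v6_dir.name == 'v6' and v6_dir.parent.name == 'libstdcxx':
            return v6_dir.parent.parent
    return None


prefix = os.environ.get('CONDA_PREFIX')
python_dir = find_stl_printers_sketch(prefix) if prefix else None
if python_dir is None:
    print('libstdc++ pretty printers not found; is gxx_linux-64 installed?')
else:
    print(f'libstdc++ pretty printers found under {python_dir}')
```

Inside gdb, the returned directory would then be added to `sys.path` so that `from libstdcxx.v6.printers import register_libstdcxx_printers` works, followed by `register_libstdcxx_printers(None)`.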

 

There is also a utility command `parr` which takes an "expression" and will attempt to use one of the arrow pretty print utilities to print the result of the expression.
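
For reference, a custom gdb command that takes an expression is registered roughly as in the sketch below. This only illustrates the registration mechanics. The command name `parr_sketch` and the helper `summarize_sketch` are hypothetical, and the helper just prints the type and the value rather than calling any Arrow pretty print utility the way `parr` does.

```python
try:
    import gdb  # only importable inside a gdb session
except ImportError:
    gdb = None


def summarize_sketch(value):
    """Hypothetical formatter: the value's type followed by its printed form."""
    return f'({value.type}) {value}'


if gdb is not None:
    class ParrSketch(gdb.Command):
        """parr_sketch EXPRESSION: evaluate EXPRESSION and print a summary."""

        def __init__(self):
            # COMPLETE_EXPRESSION gives tab completion for the argument.
            super().__init__('parr_sketch', gdb.COMMAND_DATA,
                             gdb.COMPLETE_EXPRESSION)

        def invoke(self, arg, from_tty):
            # Evaluate the argument in the context of the selected frame.
            value = gdb.parse_and_eval(arg)
            print(summarize_sketch(value))

    # Instantiating the class registers the command with gdb.
    ParrSketch()
```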

 

Example commands:

 
{code:java}
p *by.data_
p (*(by.data())).child_data
p *((*(by.data())).child_data[0])
p (*((*(by.data())).child_data[0])).buffers
p *((*((*(by.data())).child_data[0])).buffers[1])
p *((*((*(by.data())).child_data[0])).buffers[2])
parr by
{code}
Output with pretty printers:

 

 
{code:java}
(gdb) $1 = ArrayData (type=DT("struct<a: string, b: int32>") length=8 offset=0 buffers=0x555555715f68 child_data=0x555555715f80)
(gdb) $2 = std::vector of length 2, capacity 2 = {std::shared_ptr<arrow::ArrayData> (use count 2, weak count 0) = {get() = 0x555555713ff0}, std::shared_ptr<arrow::ArrayData> (use count 2, weak count 0) = {
    get() = 0x555555714070}}
(gdb) $3 = ArrayData (type=DT("string") length=8 offset=0 buffers=0x555555714018 child_data=0x555555714030)
(gdb) $4 = std::vector of length 3, capacity 3 = {std::shared_ptr<arrow::Buffer> (empty) = {get() = 0x0}, std::shared_ptr<arrow::Buffer> (use count 1, weak count 0) = {get() = 0x5555556a5b00}, 
  std::shared_ptr<arrow::Buffer> (use count 1, weak count 0) = {get() = 0x5555556eee30}}
(gdb) $5 = Buffer (size=36 capacity=64 data_addr=0x7ffff4209400 "") = {x00, x00, x00, x00, x02, x00, x00, x00, x04, x00, x00, x00, x07, x00, x00, x00, x09, x00, x00, x00, x0c, x00, x00, x00, x0e, x00, x00, x00, x10, 
  x00, x00, x00, x13, x00, x00, x00}
(gdb) $6 = Buffer (size=19 capacity=64 data_addr=0x7ffff4209080 "exexwhyexwhyexexwhy") = {x65, x78, x65, x78, x77, x68, x79, x65, x78, x77, x68, x79, x65, x78, x65, x78, x77, x68, x79}
(gdb)   -- is_valid: all not null
  -- child 0 type: string
    [
      "ex",
      "ex",
      "why",
      "ex",
      "why",
      "ex",
      "ex",
      "why"
    ]
  -- child 1 type: int32
    [
      0,
      0,
      0,
      1,
      0,
      1,
      0,
      1
    ]
{code}
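
To see how the two buffers printed as $5 and $6 relate to the strings shown by `parr`, the short sketch below decodes them by hand. It copies the bytes from the output above and assumes a little-endian machine. In Arrow's string layout, the first buffer is the validity bitmap (empty here, since nothing is null), the second holds int32 offsets, and the third holds the concatenated character data.

```python
import struct

# Bytes of buffers[1] (the offsets buffer), copied from the $5 output.
offsets_bytes = bytes.fromhex(
    '00000000 02000000 04000000 07000000 09000000 '
    '0c000000 0e000000 10000000 13000000'
)
# Bytes of buffers[2] (the data buffer), copied from the $6 output.
data = b'exexwhyexwhyexexwhy'

# Each offset is a little-endian int32; there are length + 1 of them.
count = len(offsets_bytes) // 4
offsets = struct.unpack(f'<{count}i', offsets_bytes)

# Value i is the slice of data between offsets[i] and offsets[i + 1].
values = [data[start:end].decode('utf-8')
          for start, end in zip(offsets, offsets[1:])]
print(values)  # ['ex', 'ex', 'why', 'ex', 'why', 'ex', 'ex', 'why']
```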
Output without pretty printers:

 

 
{code:java}
(gdb) $1 = (std::__shared_ptr_access<arrow::ArrayData, (__gnu_cxx::_Lock_policy)2, false, false>::element_type &) @0x555555715f10: {
  type = {<std::__shared_ptr<arrow::DataType, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<arrow::DataType, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 
    0x5555556eee70, _M_refcount = {_M_pi = 0x5555556eee60}}, <No data fields>}, length = 8, null_count = {<std::__atomic_base<long>> = {static _S_alignment = 8, _M_i = 0}, <No data fields>}, 
  offset = 0, buffers = {<std::_Vector_base<std::shared_ptr<arrow::Buffer>, std::allocator<std::shared_ptr<arrow::Buffer> > >> = {
      _M_impl = {<std::allocator<std::shared_ptr<arrow::Buffer> >> = {<__gnu_cxx::new_allocator<std::shared_ptr<arrow::Buffer> >> = {<No data fields>}, <No data fields>}, <std::_Vector_base<std::shared_ptr<arrow::Buffer>, std::allocator<std::shared_ptr<arrow::Buffer> > >::_Vector_impl_data> = {_M_start = 0x5555557150b0, _M_finish = 0x5555557150c0, 
          _M_end_of_storage = 0x5555557150c0}, <No data fields>}}, <No data fields>}, 
  child_data = {<std::_Vector_base<std::shared_ptr<arrow::ArrayData>, std::allocator<std::shared_ptr<arrow::ArrayData> > >> = {
      _M_impl = {<std::allocator<std::shared_ptr<arrow::ArrayData> >> = {<__gnu_cxx::new_allocator<std::shared_ptr<arrow::ArrayData> >> = {<No data fields>}, <No data fields>}, <std::_Vector_base<std::shared_ptr<arrow::ArrayData>, std::allocator<std::shared_ptr<arrow::ArrayData> > >::_Vector_impl_data> = {_M_start = 0x555555714580, _M_finish = 0x5555557145a0, 
          _M_end_of_storage = 0x5555557145a0}, <No data fields>}}, <No data fields>}, 
  dictionary = {<std::__shared_ptr<arrow::ArrayData, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<arrow::ArrayData, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, 
      _M_ptr = 0x0, _M_refcount = {_M_pi = 0x0}}, <No data fields>}}
(gdb) $2 = {<std::_Vector_base<std::shared_ptr<arrow::ArrayData>, std::allocator<std::shared_ptr<arrow::ArrayData> > >> = {
    _M_impl = {<std::allocator<std::shared_ptr<arrow::ArrayData> >> = {<__gnu_cxx::new_allocator<std::shared_ptr<arrow::ArrayData> >> = {<No data fields>}, <No data fields>}, <std::_Vector_base<std::shared_ptr<arrow::ArrayData>, std::allocator<std::shared_ptr<arrow::ArrayData> > >::_Vector_impl_data> = {_M_start = 0x555555714580, _M_finish = 0x5555557145a0, 
        _M_end_of_storage = 0x5555557145a0}, <No data fields>}}, <No data fields>}
(gdb) $3 = (std::__shared_ptr_access<arrow::ArrayData, (__gnu_cxx::_Lock_policy)2, false, false>::element_type &) @0x555555713fc0: {
  type = {<std::__shared_ptr<arrow::DataType, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<arrow::DataType, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, 
      _M_ptr = 0x5555556a20f0, _M_refcount = {_M_pi = 0x5555556a20e0}}, <No data fields>}, length = 8, null_count = {<std::__atomic_base<long>> = {static _S_alignment = 8, _M_i = 0}, <No data fields>}, 
  offset = 0, buffers = {<std::_Vector_base<std::shared_ptr<arrow::Buffer>, std::allocator<std::shared_ptr<arrow::Buffer> > >> = {
      _M_impl = {<std::allocator<std::shared_ptr<arrow::Buffer> >> = {<__gnu_cxx::new_allocator<std::shared_ptr<arrow::Buffer> >> = {<No data fields>}, <No data fields>}, <std::_Vector_base<std::shared_ptr<arrow::Buffer>, std::allocator<std::shared_ptr<arrow::Buffer> > >::_Vector_impl_data> = {_M_start = 0x555555713f70, _M_finish = 0x555555713fa0, 
          _M_end_of_storage = 0x555555713fa0}, <No data fields>}}, <No data fields>}, 
  child_data = {<std::_Vector_base<std::shared_ptr<arrow::ArrayData>, std::allocator<std::shared_ptr<arrow::ArrayData> > >> = {
      _M_impl = {<std::allocator<std::shared_ptr<arrow::ArrayData> >> = {<__gnu_cxx::new_allocator<std::shared_ptr<arrow::ArrayData> >> = {<No data fields>}, <No data fields>}, <std::_Vector_base<std::shared_ptr<arrow::ArrayData>, std::allocator<std::shared_ptr<arrow::ArrayData> > >::_Vector_impl_data> = {_M_start = 0x0, _M_finish = 0x0, _M_end_of_storage = 0x0}, <No data fields>}}, <No data fields>}, 
  dictionary = {<std::__shared_ptr<arrow::ArrayData, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<arrow::ArrayData, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, 
      _M_ptr = 0x0, _M_refcount = {_M_pi = 0x0}}, <No data fields>}}
(gdb) $4 = {<std::_Vector_base<std::shared_ptr<arrow::Buffer>, std::allocator<std::shared_ptr<arrow::Buffer> > >> = {
    _M_impl = {<std::allocator<std::shared_ptr<arrow::Buffer> >> = {<__gnu_cxx::new_allocator<std::shared_ptr<arrow::Buffer> >> = {<No data fields>}, <No data fields>}, <std::_Vector_base<std::shared_ptr<arrow::Buffer>, std::allocator<std::shared_ptr<arrow::Buffer> > >::_Vector_impl_data> = {_M_start = 0x555555713f70, _M_finish = 0x555555713fa0, 
        _M_end_of_storage = 0x555555713fa0}, <No data fields>}}, <No data fields>}
(gdb) $5 = (std::__shared_ptr_access<arrow::Buffer, (__gnu_cxx::_Lock_policy)2, false, false>::element_type &) @0x5555556a5ad0: {_vptr.Buffer = 0x7ffff751d450 <vtable for arrow::PoolBuffer+16>, 
  is_mutable_ = true, is_cpu_ = true, data_ = 0x7ffff4209400 "", mutable_data_ = 0x7ffff4209400 "", size_ = 36, capacity_ = 64, 
  parent_ = {<std::__shared_ptr<arrow::Buffer, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<arrow::Buffer, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x0, 
      _M_refcount = {_M_pi = 0x0}}, <No data fields>}, 
  memory_manager_ = {<std::__shared_ptr<arrow::MemoryManager, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<arrow::MemoryManager, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x555555714140, _M_refcount = {_M_pi = 0x5555556a5b30}}, <No data fields>}}
(gdb) $6 = (std::__shared_ptr_access<arrow::Buffer, (__gnu_cxx::_Lock_policy)2, false, false>::element_type &) @0x5555556eee00: {_vptr.Buffer = 0x7ffff751d450 <vtable for arrow::PoolBuffer+16>, 
  is_mutable_ = true, is_cpu_ = true, data_ = 0x7ffff4209080 "exexwhyexwhyexexwhy", mutable_data_ = 0x7ffff4209080 "exexwhyexwhyexexwhy", size_ = 19, capacity_ = 64, 
  parent_ = {<std::__shared_ptr<arrow::Buffer, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<arrow::Buffer, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x0, 
      _M_refcount = {_M_pi = 0x0}}, <No data fields>}, 
  memory_manager_ = {<std::__shared_ptr<arrow::MemoryManager, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<arrow::MemoryManager, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x5555557152c0, _M_refcount = {_M_pi = 0x5555557141c0}}, <No data fields>}}
(gdb) Undefined command: "parr".  Try "help".

{code}
 

 

> [C++] Add pretty printing support for gdb
> -----------------------------------------
>
>                 Key: ARROW-11348
>                 URL: https://issues.apache.org/jira/browse/ARROW-11348
>             Project: Apache Arrow
>          Issue Type: Wish
>            Reporter: Weston Pace
>            Priority: Major
>
> Parsing the GDB output is error prone and can take considerable time.  Also, some information is difficult or non-intuitive to get to (e.g. the name of a data type).  We should add GDB pretty printers[1] to improve the debug workflow for developers.  This could assist not just Arrow developers but also developers using the Arrow C++ libs.


