Posted to dev@madlib.apache.org by James Gregory <ja...@gmail.com> on 2017/10/24 14:53:39 UTC

Seg fault when using postgis and madlib at the same time

When running queries that make use of both madlib cosine_similarity
and postgis ST_Intersects, we often get segmentation faults. It
doesn't happen 100% of the time; perhaps it needs multiple queries
running in parallel to trigger the segfault, or some other random
factor is involved.

The stack trace is:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000056464a888e64 in pfree ()
(gdb)
(gdb) bt
#0  0x000056464a888e64 in pfree ()
#1  0x00007f3ddf33ae4d in
madlib::dbconnector::postgres::Allocator::free<(madlib::dbal::MemoryContext)0>
(inPtr=inPtr@entry=0x56464b701ae0, this=<optimized out>)
    at /tmp/tmpbs9UjC/madlib-1.11.0/src/ports/postgres/dbconnector/Allocator_impl.hpp:189
#2  0x00007f3ddf33b16d in operator delete (ptr=0x56464b701ae0)
    at /tmp/tmpbs9UjC/madlib-1.11.0/src/ports/postgres/dbconnector/NewDelete.cpp:62
#3  0x00007f3ddf04c718 in deallocate (this=0x56464b6fa828, __p=<optimized out>)
    at /usr/include/c++/4.9/ext/new_allocator.h:110
#4  deallocate (__a=..., __n=1, __p=<optimized out>) at
/usr/include/c++/4.9/ext/alloc_traits.h:185
#5  _M_put_node (this=0x56464b6fa828, __p=<optimized out>) at
/usr/include/c++/4.9/bits/stl_tree.h:389
#6  _M_destroy_node (this=0x56464b6fa828, __p=<optimized out>)
    at /usr/include/c++/4.9/bits/stl_tree.h:410
#7  std::_Rb_tree<std::string, std::string,
std::_Identity<std::string>, std::less<std::string>,
std::allocator<std::string> >::_M_erase
(this=this@entry=0x56464b6fa828, __x=<optimized out>)
    at /usr/include/c++/4.9/bits/stl_tree.h:1247
#8  0x00007f3ddf04c6f4 in std::_Rb_tree<std::string, std::string,
std::_Identity<std::string>, std::less<std::string>,
std::allocator<std::string> >::_M_erase (this=0x56464b6fa828,
__x=0x56464b701a80)
    at /usr/include/c++/4.9/bits/stl_tree.h:1245
#9  0x00007f3ddc714a99 in osgDB::Registry::~Registry() ()
   from /usr/lib/x86_64-linux-gnu/libosgDB.so.100
#10 0x00007f3ddc714d99 in osgDB::Registry::~Registry() ()
   from /usr/lib/x86_64-linux-gnu/libosgDB.so.100
#11 0x00007f3e27b6ab29 in __run_exit_handlers (status=0,
listp=0x7f3e27ed85a8 <__exit_funcs>,
    run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#12 0x00007f3e27b6ab75 in __GI_exit (status=<optimized out>) at exit.c:104
#13 0x000056464a748504 in proc_exit ()
#14 0x000056464a768c63 in PostgresMain ()
#15 0x000056464a501001 in ?? ()
#16 0x000056464a70c9d1 in PostmasterMain ()
#17 0x000056464a502187 in main ()

osgDB::Registry is part of the OpenSceneGraph library
(libosgDB.so.100 in the stack trace), which postgis pulls in through
its dependency on SFCGAL.

I can see in madlib's NewDelete.cpp the comment:

* We override the C++ global memory allocation and deallocation functions. We
* map them to ultimately use the PostgreSQL memory routines to protect against
* memory leaks.

I guess somehow the memory management in open scene graph interacts
badly with these overrides?
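
To illustrate the kind of interaction I suspect, here is a minimal
sketch (not madlib's actual code). It replaces the global operator
new/delete with counting wrappers around malloc/free, which stand in
for the PostgreSQL memory routines. The names Counts, g_allocs,
g_frees and exerciseForeignContainer are made up for this example.
It shows that a std::set<std::string>, the same container type that
osgDB::Registry is destroying in the trace, frees its nodes through
the replaced operator delete even though the container's owner knows
nothing about the override.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <set>
#include <string>

// Hypothetical stand-ins for the PostgreSQL memory routines: these wrappers
// only count calls and forward to malloc/free. Not thread-safe.
static std::size_t g_allocs = 0;
static std::size_t g_frees = 0;

// Replacing the global allocation function affects every caller in the
// program that resolves to this definition, not just our own code.
void* operator new(std::size_t size) {
    ++g_allocs;
    if (void* p = std::malloc(size ? size : 1)) {
        return p;
    }
    throw std::bad_alloc();
}

// The matching replacement for deallocation.
void operator delete(void* ptr) noexcept {
    if (ptr != nullptr) {
        ++g_frees;
        std::free(ptr);
    }
}

struct Counts {
    std::size_t allocs;
    std::size_t frees;
};

// Builds and destroys a container the way an unrelated library might.
// std::set allocates and frees its tree nodes via std::allocator, which
// calls the global operator new/delete, i.e. the replacements above.
Counts exerciseForeignContainer() {
    const std::size_t allocsBefore = g_allocs;
    const std::size_t freesBefore = g_frees;
    {
        std::set<std::string> extensions;
        extensions.insert("a-file-extension-long-enough-to-need-heap-storage");
        extensions.insert("osg");
    }  // destructor runs here: _Rb_tree::_M_erase -> operator delete
    const Counts counts{g_allocs - allocsBefore, g_frees - freesBefore};
    // Everything the container allocated came back through our delete.
    assert(counts.allocs == counts.frees);
    return counts;
}
```

Calling exerciseForeignContainer() returns non-zero counts, so the
container's memory went through the replacement functions. If the
same thing happens to osgDB::Registry, its destructor (which runs from
the exit handlers after proc_exit in the trace above) would end up in
pfree at a point where that may no longer be safe. That last part is
only my assumption.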

A colleague is looking into using a more recent version of postgis,
which may make the problem go away, though we are already using madlib
1.11 and postgis 2.3.3+dfsg-1.pgdg80+1, which are pretty recent. It's
also possible that the conflict only happens with the Debian/Ubuntu
postgis binaries, in which case installing postgis using pgxn or even
compiling it from source might resolve the issue.

Still, it seems a bit strange that a custom override of memory
deallocation in madlib can cause a destructor in open scene graph to
segfault.

--
James