Posted to dev@quickstep.apache.org by Dylan Bacon <db...@wisc.edu> on 2017/11/16 00:19:04 UTC

Multiple Hash Join Table Errors

I'm working on getting a single physical plan node to build two hash 
tables. The code is on my fork branch 
(https://github.com/dylanpbacon/incubator-quickstep/tree/Generalized-Hash) 
and in the associated PR on the main tree. When BuildGeneralizedHash 
attempts to populate a hash table, I run into a 
key.isPlausibleInstanceOf error. The check apparently considers the key 
types implausible, even though everything looks to be in order.

The relevant code starts at ExecutionGenerator.cpp:1124; from there, 
follow the classes and execution order for that function. Is there 
anything fundamental to how a hash table works within a given query 
context or plan node that would prohibit using two at the same time, or 
that would cause them to conflict in some way? Both hash tables should 
fit within memory for this first implementation.
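
For readers following the thread, here is a minimal sketch of one way a 
key-type check could fail when a single build step feeds two tables: a 
key drawn from one table's attribute reaches the other table. This is 
only a hypothesis about the class of bug, not a diagnosis of the branch 
above. Every name in it (ToyValue, ToyHashTable, ToyBuildTarget, 
buildAll) is hypothetical and is not Quickstep's API; only the method 
name isPlausibleInstanceOf is borrowed from the error. It assumes C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <variant>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are Quickstep types.
enum class ToyTypeID { kInt, kChar };

struct ToyValue {
  std::variant<std::int32_t, std::string> data;

  // True if the stored value could be an instance of the given type.
  bool isPlausibleInstanceOf(ToyTypeID type) const {
    switch (type) {
      case ToyTypeID::kInt:
        return std::holds_alternative<std::int32_t>(data);
      case ToyTypeID::kChar:
        return std::holds_alternative<std::string>(data);
    }
    return false;
  }
};

using ToyTuple = std::vector<ToyValue>;

class ToyHashTable {
 public:
  explicit ToyHashTable(ToyTypeID key_type) : key_type_(key_type) {}

  // Returns false on a type mismatch instead of aborting, so the
  // mismatch can be observed by the caller.
  bool put(const ToyValue &key, std::size_t tuple_id) {
    if (!key.isPlausibleInstanceOf(key_type_)) {
      return false;
    }
    const std::size_t hash =
        std::hash<std::variant<std::int32_t, std::string>>{}(key.data);
    buckets_.emplace(hash, tuple_id);
    return true;
  }

  std::size_t size() const { return buckets_.size(); }

 private:
  ToyTypeID key_type_;
  std::unordered_multimap<std::size_t, std::size_t> buckets_;
};

// Each target pairs a table with the attribute that supplies its keys.
struct ToyBuildTarget {
  ToyHashTable *table;
  std::size_t key_attr_id;
};

// Inserts every tuple into every target table and counts rejected keys.
std::size_t buildAll(const std::vector<ToyTuple> &tuples,
                     const std::vector<ToyBuildTarget> &targets) {
  std::size_t rejected = 0;
  for (std::size_t tid = 0; tid < tuples.size(); ++tid) {
    for (const ToyBuildTarget &target : targets) {
      if (!target.table->put(tuples[tid][target.key_attr_id], tid)) {
        ++rejected;
      }
    }
  }
  return rejected;
}
```

In this toy model, pairing each table with the right attribute rejects 
nothing, while swapping the two attribute ids rejects every key. A 
mismatch of this kind is deterministic, so it would not by itself 
explain a failure that appears only on some runs.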

-- 
Regards,

Dylan Bacon
University of Wisconsin - Madison
Department of Computer Sciences
dbacon@wisc.edu


Re: Multiple Hash Join Table Errors

Posted by Dylan Bacon <db...@wisc.edu>.
Changing force_key_copy to TRUE did not fix the error. To answer the 
question I missed in the previous email: since we're using a 
SimpleScalar HashTable here, I believe the keys are single attributes 
rather than composite. They are standard int and character types, taken 
from the basic tests in the existing execution generator join test 
suite.
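
As background on what a key-copy option generally buys, here is a 
minimal sketch contrasting a table that copies key bytes with one that 
only keeps a pointer to them. ToyKeyStore and its template parameter are 
hypothetical and only borrow the name force_key_copy; this is not 
Quickstep's HashTable implementation, and how Quickstep applies the flag 
to fixed-width int or char keys is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical key store: copies key bytes when force_key_copy is true,
// otherwise remembers only where the caller's bytes live.
template <bool force_key_copy>
class ToyKeyStore {
 public:
  void put(const char *key, std::size_t length) {
    if (force_key_copy) {
      owned_.emplace_back(key, length);            // private copy
    } else {
      borrowed_.push_back(Borrowed{key, length});  // caller keeps ownership
    }
  }

  // Reads the key back; for borrowed keys this re-reads the caller's
  // memory, which must still be alive.
  std::string get(std::size_t index) const {
    if (force_key_copy) {
      return owned_[index];
    }
    return std::string(borrowed_[index].ptr, borrowed_[index].length);
  }

 private:
  struct Borrowed {
    const char *ptr;
    std::size_t length;
  };
  std::vector<std::string> owned_;
  std::vector<Borrowed> borrowed_;
};
```

If the caller's buffer is overwritten after the insert, the copying 
store still returns the original key, while the borrowing store returns 
the new bytes. Copying therefore only helps when the key's backing 
memory can change or be released while the table still refers to it.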



Re: Multiple Hash Join Table Errors

Posted by Dylan Bacon <db...@wisc.edu>.
As requested, the backtrace from lldb for the error I'm seeing is at the 
bottom of this message. I think there might be a race condition between 
the two hash tables being constructed: sometimes execution proceeds past 
the build stage to the point where I expect it to fail, and sometimes it 
fails on key validation. Looking at the HashTable code, there might be 
an issue with my using a defined type with force_key_copy set to FALSE. 
The work orders might be alternating between constructing the two hash 
tables, so would setting this to TRUE fix the race condition? I think it 
might, if I'm interpreting it correctly.

* thread #2, stop reason = signal SIGABRT
   * frame #0: 0x00007fffc1dccd42 libsystem_kernel.dylib`__pthread_kill + 10
     frame #1: 0x00007fffc1eba457 libsystem_pthread.dylib`pthread_kill + 90
     frame #2: 0x00007fffc1d32420 libsystem_c.dylib`abort + 129
     frame #3: 0x00007fffc1cf9893 libsystem_c.dylib`__assert_rtn + 320
     frame #4: 0x000000010067cc2c quickstep_queryoptimizer_tests_ExecutionGeneratorTest`quickstep::SimpleScalarSeparateChainingHashTable<quickstep::TupleReference, true, false, false, true>::putInternal(this=0x000000011e302e10, key=0x000070000055ee30, variable_key_size=0, value=0x000070000055ee20, prealloc_state=0x000070000055f018) at SimpleScalarSeparateChainingHashTable.hpp:629
     frame #5: 0x00000001002d5fdd quickstep_queryoptimizer_tests_ExecutionGeneratorTest`quickstep::HashTablePutResult quickstep::HashTablePutResult quickstep::HashTable<quickstep::TupleReference, true, false, false, true>::putValueAccessor<quickstep::(anonymous namespace)::TupleReferenceGenerator>(this=0x000070000055efd0, accessor=0x000000011d703950)::TupleReferenceGenerator*)::'lambda'(quickstep::(anonymous namespace)::TupleReferenceGenerator*)::operator()<quickstep::SplitRowStoreValueAccessor>(quickstep::(anonymous namespace)::TupleReferenceGenerator*) const at HashTable.hpp:1424
     frame #6: 0x00000001002d0472 quickstep_queryoptimizer_tests_ExecutionGeneratorTest`auto quickstep::InvokeOnValueAccessorNotAdapter<quickstep::HashTablePutResult quickstep::HashTable<quickstep::TupleReference, true, false, false, true>::putValueAccessor<quickstep::(anonymous namespace)::TupleReferenceGenerator>(quickstep::ValueAccessor*, int, bool, quickstep::(anonymous namespace)::TupleReferenceGenerator*)::'lambda'(quickstep::(anonymous namespace)::TupleReferenceGenerator*)>(accessor=0x000000011d703950, functor=0x000070000055efd0)::TupleReferenceGenerator const&) at ValueAccessorUtil.hpp:73
     frame #7: 0x00000001002cfb43 quickstep_queryoptimizer_tests_ExecutionGeneratorTest`auto quickstep::InvokeOnAnyValueAccessor<quickstep::HashTablePutResult quickstep::HashTable<quickstep::TupleReference, true, false, false, true>::putValueAccessor<quickstep::(anonymous namespace)::TupleReferenceGenerator>(quickstep::ValueAccessor*, int, bool, quickstep::(anonymous namespace)::TupleReferenceGenerator*)::'lambda'(quickstep::(anonymous namespace)::TupleReferenceGenerator*)>(accessor=0x000000011d703950, functor=0x000070000055efd0)::TupleReferenceGenerator const&) at ValueAccessorUtil.hpp:256
     frame #8: 0x00000001002cf54e quickstep_queryoptimizer_tests_ExecutionGeneratorTest`quickstep::HashTablePutResult quickstep::HashTable<quickstep::TupleReference, true, false, false, true>::putValueAccessor<quickstep::(anonymous namespace)::TupleReferenceGenerator>(this=0x000000011e302e10, accessor=0x000000011d703950, key_attr_id=0, check_for_null_keys=false, functor=0x000070000055f1b0)::TupleReferenceGenerator*) at HashTable.hpp:1373
     frame #9: 0x00000001002ceecc quickstep_queryoptimizer_tests_ExecutionGeneratorTest`quickstep::BuildGeneralizedHashWorkOrder::execute(this=0x000000011d703880) at BuildGeneralizedHashOperator.cpp:283
     frame #10: 0x00000001000c0c55 quickstep_queryoptimizer_tests_ExecutionGeneratorTest`quickstep::Worker::executeWorkOrderHelper(this=0x000000011e0025d0, tagged_message=0x000070000055fc50, proto=0x000070000055fbc0, is_rebuild_work_order=false) at Worker.cpp:137
     frame #11: 0x00000001000c066d quickstep_queryoptimizer_tests_ExecutionGeneratorTest`quickstep::Worker::run(this=0x000000011e0025d0) at Worker.cpp:78
     frame #12: 0x000000010023c0a4 quickstep_queryoptimizer_tests_ExecutionGeneratorTest`quickstep::threading_internal::executeRunMethodForThreadReturnNothing(thread_ptr=0x000000011e0025d0) at Thread.cpp:30
     frame #13: 0x000000010000f15d quickstep_queryoptimizer_tests_ExecutionGeneratorTest`void* std::__1::__thread_proxy<std::__1::tuple<void (*)(void*), quickstep::ThreadImplCPP11*> >(void*) [inlined] decltype(__f=0x000000011e0039f0, __args=0x000000011e0039f8)(void*)>(fp)(std::__1::forward<quickstep::ThreadImplCPP11*>(fp0))) std::__1::__invoke<void (*)(void*), quickstep::ThreadImplCPP11*>(void (*&&)(void*), quickstep::ThreadImplCPP11*&&) at __functional_base:416
     frame #14: 0x000000010000f145 quickstep_queryoptimizer_tests_ExecutionGeneratorTest`void* std::__1::__thread_proxy<std::__1::tuple<void (*)(void*), quickstep::ThreadImplCPP11*> >(void*) [inlined] void std::__1::__thread_execute<void (*)(void*), quickstep::ThreadImplCPP11*, 1ul>(__t=0x000000011e0039f0)(void*), quickstep::ThreadImplCPP11*>&, std::__1::__tuple_indices<1ul>) at thread:347
     frame #15: 0x000000010000f11d quickstep_queryoptimizer_tests_ExecutionGeneratorTest`void* std::__1::__thread_proxy<std::__1::tuple<void (*)(void*), quickstep::ThreadImplCPP11*> >(__vp=0x000000011e0039f0) at thread:357
     frame #16: 0x00007fffc1eb793b libsystem_pthread.dylib`_pthread_body + 180
     frame #17: 0x00007fffc1eb7887 libsystem_pthread.dylib`_pthread_start + 286
     frame #18: 0x00007fffc1eb708d libsystem_pthread.dylib`thread_start + 13



Re: Multiple Hash Join Table Errors

Posted by Harshad Deshmukh <ha...@cs.wisc.edu>.
Hi Dylan,


Is the key in the hash table a composite attribute? If so, what are the data types of its component attributes?


It would be helpful if you could post a trace of your execution from any debugger.


Thanks,

Harshad
