Posted to issues@impala.apache.org by "Henry Robinson (JIRA)" <ji...@apache.org> on 2017/05/20 20:41:04 UTC

[jira] [Resolved] (IMPALA-5138) Running 32 concurrent queries from TPC-DS Q31 caused a crash in "impala::BufferedTupleStream::CopyStrings (this=0x7f182c9b4440, tuple=0x7f15aa008000, string_slots=...) buffered-tuple-stream.cc:840"

     [ https://issues.apache.org/jira/browse/IMPALA-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henry Robinson resolved IMPALA-5138.
------------------------------------
       Resolution: Duplicate
    Fix Version/s: Impala 2.10.0

Seems to be the same root cause as IMPALA-5093.

> Running 32 concurrent queries from TPC-DS Q31 caused a crash in "impala::BufferedTupleStream::CopyStrings (this=0x7f182c9b4440, tuple=0x7f15aa008000, string_slots=...)   buffered-tuple-stream.cc:840"
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-5138
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5138
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Mostafa Mokhtar
>            Assignee: Henry Robinson
>             Fix For: Impala 2.10.0
>
>
> Running 32 concurrent TPC-DS Q31 queries against a 138-node cluster crashed the coordinator node:
> {code}
> (gdb) bt
> #0  0x00000037dce32625 in raise () from /lib64/libc.so.6
> #1  0x00000037dce33e05 in abort () from /lib64/libc.so.6
> #2  0x00007f1b61179a55 in os::abort(bool) () from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
> #3  0x00007f1b612f9f87 in VMError::report_and_die() () from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
> #4  0x00007f1b6117e96f in JVM_handle_linux_signal () from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
> #5  <signal handler called>
> #6  0x00000037dce89a97 in memcpy () from /lib64/libc.so.6
> #7  0x0000000000f4f68c in impala::BufferedTupleStream::CopyStrings (this=0x7f182c9b4440, tuple=0x7f15aa008000, string_slots=...)
>     at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/buffered-tuple-stream.cc:840
> #8  0x0000000000f4fd75 in DeepCopyInternal<false> (this=0x7f182c9b4440, row=0x7f17d7ec4b08)
>     at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/buffered-tuple-stream.cc:815
> #9  impala::BufferedTupleStream::DeepCopy (this=0x7f182c9b4440, row=0x7f17d7ec4b08)
>     at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/buffered-tuple-stream.cc:752
> #10 0x0000000000e751af in AddRow (this=0x7f171fcdc1c0, stream=0x7f182c9b4440, row=0x7f17d7ec4b08, status=0x7f1521a9e780)
>     at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/runtime/buffered-tuple-stream.inline.h:30
> #11 impala::PhjBuilder::AppendRowStreamFull (this=0x7f171fcdc1c0, stream=0x7f182c9b4440, row=0x7f17d7ec4b08, status=0x7f1521a9e780)
>     at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/exec/partitioned-hash-join-builder.cc:281
> #12 0x00007f18e5422316 in impala::PhjBuilder::ProcessBuildBatch ()
> #13 0x0000000000e76351 in impala::PhjBuilder::Send (this=0x7f171fcdc1c0, state=Unhandled dwarf expression opcode 0xf3
> )
>     at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/exec/partitioned-hash-join-builder.cc:174
> #14 0x0000000000e5f673 in impala::BlockingJoinNode::SendBuildInputToSink<true> (this=0x7f1580c44700, state=0x7f159c82f100, build_sink=
>     0x7f171fcdc1c0) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/exec/blocking-join-node.cc:287
> #15 0x0000000000e5def7 in impala::BlockingJoinNode::ProcessBuildInputAsync (this=0x7f1580c44700, state=0x7f159c82f100, build_sink=0x7f171fcdc1c0,
>     status=0x7f1524ba0b50) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/exec/blocking-join-node.cc:154
> #16 0x0000000000d59509 in operator() (name=Unhandled dwarf expression opcode 0xf3
> )
>     at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0-p1/include/boost/function/function_template.hpp:767
> #17 impala::Thread::SuperviseThread (name=Unhandled dwarf expression opcode 0xf3
> ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/util/thread.cc:325
> #18 0x0000000000d59f54 in operator()<void (*)(const std::basic_string<char>&, const std::basic_string<char>&, boost::function<void()>, impala::Promise<long int>*), boost::_bi::list0> (this=0x7f15aa6f3a00)
>     at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0-p1/include/boost/bind/bind.hpp:457
> #19 operator() (this=0x7f15aa6f3a00)
>     at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0-p1/include/boost/bind/bind_template.hpp:20
> #20 boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, boost::function<void()>, impala::Promise<long int>*), boost::_bi::list4<boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<boost::function<void()> >, boost::_bi::value<impala::Promise<long int>*> > > >::run(void)
>     (this=0x7f15aa6f3a00)
>     at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0-p1/include/boost/thread/detail/thread.hpp:116
> #21 0x0000000000fdf9fa in thread_proxy ()
> #22 0x00000037dd207aa1 in start_thread () from /lib64/libpthread.so.0
> #23 0x00000037dcee893d in clone () from /lib64/libc.so.6
> {code}
> To reproduce, run this command:
> {code}
> bash ./run.sh 32 "use tpcds_30000_parquet; with ss as (select ca_county,d_qoy, d_year,sum(ss_ext_sales_price) as store_sales from store_sales,date_dim,customer_address where ss_sold_date_sk = d_date_sk and ss_addr_sk=ca_address_sk group by ca_county,d_qoy, d_year), ws as (select ca_county,d_qoy, d_year,sum(ws_ext_sales_price) as web_sales from web_sales,date_dim,customer_address where ws_sold_date_sk = d_date_sk and ws_bill_addr_sk=ca_address_sk group by ca_county,d_qoy, d_year) select ss1.ca_county ,ss1.d_year ,ws2.web_sales/ws1.web_sales web_q1_q2_increase ,ss2.store_sales/ss1.store_sales store_q1_q2_increase ,ws3.web_sales/ws2.web_sales web_q2_q3_increase ,ss3.store_sales/ss2.store_sales store_q2_q3_increase from ss ss1 ,ss ss2 ,ss ss3 ,ws ws1 ,ws ws2 ,ws ws3 where ss1.d_qoy = 1 and ss1.d_year = 2002 and ss1.ca_county = ss2.ca_county and ss2.d_qoy = 2 and ss2.d_year = 2002 and ss2.ca_county = ss3.ca_county and ss3.d_qoy = 3 and ss3.d_year = 2002 and ss1.ca_county = ws1.ca_county and ws1.d_qoy = 1 and ws1.d_year = 2002 and ws1.ca_county = ws2.ca_county and ws2.d_qoy = 2 and ws2.d_year = 2002 and ws1.ca_county = ws3.ca_county and ws3.d_qoy = 3 and ws3.d_year =2002 and case when ws1.web_sales > 0 then ws2.web_sales/ws1.web_sales else null end > case when ss1.store_sales > 0 then ss2.store_sales/ss1.store_sales else null end and case when ws2.web_sales > 0 then ws3.web_sales/ws2.web_sales else null end > case when ss2.store_sales > 0 then ss3.store_sales/ss2.store_sales else null end order by ss1.d_year;" query_31_32_users
> {code}
> The script used:
> {code}
> [mmokhtar@va1026 ~]$ cat ./run.sh
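> #!/usr/bin/bash
> # Launch N concurrent copies of a query through impala-shell and time the whole run.
> loop="$1"
> query="$2"
> outputfile="$3"
> # Wall-clock start time in milliseconds.
> startT=$(($(date +%s%N)/1000000))
> for i in $(seq 1 1 "$loop"); do
>   echo "$(date "+%m-%d-%Y %T") : Start ${outputfile}.$i.txt"
>   # Run in the background so that all queries execute concurrently.
>   impala-shell -k -q "$query" > "${outputfile}.$i.txt" 2>&1 &
>   echo "$(date "+%m-%d-%Y %T") : Launched ${outputfile}.$i.txt"
> done
> # Block until every background impala-shell process has exited.
> wait
> endT=$(($(date +%s%N)/1000000))
> queryElapsedTime=$((endT-startT))
> echo "$(date "+%m-%d-%Y %T") : All done, ${loop} runs of ${outputfile} finished in ${queryElapsedTime} ms"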
> {code}


