You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@asterixdb.apache.org by "Pouria (JIRA)" <ji...@apache.org> on 2016/01/15 20:42:39 UTC

[jira] [Created] (ASTERIXDB-1267) TPCH query failure with runtime error at scale

Pouria created ASTERIXDB-1267:
---------------------------------

             Summary: TPCH query failure with runtime error at scale
                 Key: ASTERIXDB-1267
                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1267
             Project: Apache AsterixDB
          Issue Type: Bug
            Reporter: Pouria


- TPC-H query 4 (AQL) fails with a internal/runtime error, only at scale. 
- This query works fine against an asterixdb instance with a simple configuration (it passes our build tests for example) - but it fails on cluster (I am using 9 NCs, 3 partitions (IO device) each).
- This query was working fine back in Summer 2015, and it seems a recent check-in broke it.

- Here is the query:

declare function tmp()
{
  for $l in dataset('LineItem')
  where $l.l_commitdate < $l.l_receiptdate
  distinct by $l.l_orderkey
  return { "o_orderkey": $l.l_orderkey }
}

for $o in dataset('Orders')
for $t in tmp()
where $o.o_orderkey = $t.o_orderkey and 
  $o.o_orderdate >= '1993-07-01' and $o.o_orderdate < '1993-10-01' 
group by $o_orderpriority := $o.o_orderpriority with $o
order by $o_orderpriority
return {
  "order_priority": $o_orderpriority,
  "count": count($o)
}

- Here is the optimized plan:

distribute result [%0->$$25]
-- DISTRIBUTE_RESULT  |PARTITIONED|
  exchange 
  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
    project ([$$25])
    -- STREAM_PROJECT  |PARTITIONED|
      assign [$$25] <- [function-call: asterix:closed-record-constructor, Args:[AString: {order_priority}, %0->$$3, AString: {count}, %0->$$35]]
      -- ASSIGN  |PARTITIONED|
        exchange 
        -- SORT_MERGE_EXCHANGE [$$3(ASC) ]  |PARTITIONED|
          group by ([$$3 := %0->$$39]) decor ([]) {
                    aggregate [$$35] <- [function-call: asterix:agg-sum, Args:[%0->$$38]]
                    -- AGGREGATE  |LOCAL|
                      nested tuple source
                      -- NESTED_TUPLE_SOURCE  |LOCAL|
                 }
          -- PRE_CLUSTERED_GROUP_BY[$$39]  |PARTITIONED|
            exchange 
            -- HASH_PARTITION_MERGE_EXCHANGE MERGE:[$$39(ASC)] HASH:[$$39]  |PARTITIONED|
              group by ([$$39 := %0->$$27]) decor ([]) {
                        aggregate [$$38] <- [function-call: asterix:agg-count, Args:[AInt64: {1}]]
                        -- AGGREGATE  |LOCAL|
                          nested tuple source
                          -- NESTED_TUPLE_SOURCE  |LOCAL|
                     }
              -- SORT_GROUP_BY[$$27]  |PARTITIONED|
                exchange 
                -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                  project ([$$27])
                  -- STREAM_PROJECT  |PARTITIONED|
                    exchange 
                    -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                      join (function-call: algebricks:eq, Args:[%0->$$30, %0->$$31])
                      -- HYBRID_HASH_JOIN [$$30][$$31]  |PARTITIONED|
                        exchange 
                        -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                          project ([$$27, $$30])
                          -- STREAM_PROJECT  |PARTITIONED|
                            select (function-call: algebricks:and, Args:[function-call: algebricks:ge, Args:[%0->$$29, AString: {1993-07-01}], function-call: algebricks:lt, Args:[%0->$$29, AString: {1993-10-01}]])
                            -- STREAM_SELECT  |PARTITIONED|
                              project ([$$27, $$29, $$30])
                              -- STREAM_PROJECT  |PARTITIONED|
                                assign [$$27, $$29] <- [function-call: asterix:field-access-by-index, Args:[%0->$$4, AInt32: {5}], function-call: asterix:field-access-by-index, Args:[%0->$$4, AInt32: {4}]]
                                -- ASSIGN  |PARTITIONED|
                                  exchange 
                                  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                    unnest-map [$$30, $$4] <- function-call: asterix:index-search, Args:[AString: {Orders}, AInt32: {0}, AString: {tpch_450g}, AString: {Orders}, ABoolean: {false}, ABoolean: {false}, ABoolean: {false}, AInt32: {1}, %0->$$43, AInt32: {1}, %0->$$43, TRUE, TRUE, TRUE]
                                    -- BTREE_SEARCH  |PARTITIONED|
                                      exchange 
                                      -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                        order (ASC, %0->$$43) 
                                        -- STABLE_SORT [$$43(ASC)]  |PARTITIONED|
                                          exchange 
                                          -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                            project ([$$43])
                                            -- STREAM_PROJECT  |PARTITIONED|
                                              exchange 
                                              -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                                unnest-map [$$42, $$43] <- function-call: asterix:index-search, Args:[AString: {orders_orderdateIx}, AInt32: {0}, AString: {tpch_450g}, AString: {Orders}, ABoolean: {false}, ABoolean: {false}, ABoolean: {false}, AInt32: {1}, %0->$$40, AInt32: {1}, %0->$$41, TRUE, FALSE, FALSE]
                                                -- BTREE_SEARCH  |PARTITIONED|
                                                  exchange 
                                                  -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                                    assign [$$40, $$41] <- [AString: {1993-07-01}, AString: {1993-10-01}]
                                                    -- ASSIGN  |PARTITIONED|
                                                      empty-tuple-source
                                                      -- EMPTY_TUPLE_SOURCE  |PARTITIONED|
                        exchange 
                        -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                          distinct ([%0->$$31])
                          -- PRE_SORTED_DISTINCT_BY  |PARTITIONED|
                            exchange 
                            -- HASH_PARTITION_MERGE_EXCHANGE MERGE:[$$31(ASC)] HASH:[$$31]  |PARTITIONED|
                              project ([$$31])
                              -- STREAM_PROJECT  |PARTITIONED|
                                select (function-call: algebricks:lt, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$5, AInt32: {11}], function-call: asterix:field-access-by-index, Args:[%0->$$5, AInt32: {12}]])
                                -- STREAM_SELECT  |PARTITIONED|
                                  project ([$$5, $$31])
                                  -- STREAM_PROJECT  |PARTITIONED|
                                    exchange 
                                    -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                      data-scan []<-[$$31, $$32, $$5] <- tpch_450g:LineItem
                                      -- DATASOURCE_SCAN  |PARTITIONED|
                                        exchange 
                                        -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                          empty-tuple-source
                                          -- EMPTY_TUPLE_SOURCE  |PARTITIONED|


- Here are error logs from two NCs:

from NC1:

INFO: Initializing TAID:TID:ANID:ODID:1:1:0:0 -> [org.apache.hyracks.dataflow.std.sort.ExternalSortOperatorDescriptor$2@6b5870a9, org.apache.hyracks.storage.am.btree.dataflow.BTreeSearchOperatorDescriptor@1306c1d9, org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode@a31e46f, org.apache.hyracks.dataflow.std.group.sort.SortGroupByOperatorDescriptor$1@619ff47e, Asterix { 
  assign [2, 3] := [org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@496a0ddc, org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@6c471521];
  stream-project [2, 3, 0];
  stream-select org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@69d0865a;
  stream-project [0, 2];
}, Asterix { 
  stream-project [0];
}]
Jan 15, 2016 10:35:07 AM org.apache.hyracks.control.nc.work.StartTasksWork run
INFO: Initializing TAID:TID:ANID:ODID:1:1:1:0 -> [org.apache.hyracks.dataflow.std.sort.ExternalSortOperatorDescriptor$2@6b5870a9, org.apache.hyracks.storage.am.btree.dataflow.BTreeSearchOperatorDescriptor@1306c1d9, org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode@a31e46f, org.apache.hyracks.dataflow.std.group.sort.SortGroupByOperatorDescriptor$1@619ff47e, Asterix { 
  assign [2, 3] := [org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@496a0ddc, org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@6c471521];
  stream-project [2, 3, 0];
  stream-select org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@69d0865a;
  stream-project [0, 2];
}, Asterix { 
  stream-project [0];
}]
Jan 15, 2016 10:35:07 AM org.apache.hyracks.control.nc.work.StartTasksWork run
INFO: Initializing TAID:TID:ANID:ODID:1:1:2:0 -> [org.apache.hyracks.dataflow.std.sort.ExternalSortOperatorDescriptor$2@6b5870a9, org.apache.hyracks.storage.am.btree.dataflow.BTreeSearchOperatorDescriptor@1306c1d9, org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode@a31e46f, org.apache.hyracks.dataflow.std.group.sort.SortGroupByOperatorDescriptor$1@619ff47e, Asterix { 
  assign [2, 3] := [org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@496a0ddc, org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@6c471521];
  stream-project [2, 3, 0];
  stream-select org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@69d0865a;
  stream-project [0, 2];
}, Asterix { 
  stream-project [0];
}]
org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:215)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
	at org.apache.hyracks.control.nc.Task.run(Task.java:258)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:213)
	... 5 more
Caused by: java.lang.NullPointerException
	at org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:183)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:204)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:201)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	... 3 more
org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:215)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
	at org.apache.hyracks.control.nc.Task.run(Task.java:258)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:213)
	... 5 more
Caused by: java.lang.NullPointerException
	at org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:183)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:204)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:201)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	... 3 more
org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:215)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
	at org.apache.hyracks.control.nc.Task.run(Task.java:258)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:213)
	... 5 more
Caused by: java.lang.NullPointerException
	at org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:183)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:204)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:201)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	... 3 more  



>From NC2:

org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:215)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
	at org.apache.hyracks.control.nc.Task.run(Task.java:258)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:213)
	... 5 more
Caused by: java.lang.NullPointerException
	at org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:183)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:204)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:201)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	... 3 more
org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:215)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
	at org.apache.hyracks.control.nc.Task.run(Task.java:258)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:213)
	... 5 more
Caused by: java.lang.NullPointerException
	at org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:183)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:204)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:201)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	... 3 more
Jan 15, 2016 10:35:08 AM org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: NotifyTaskFailure
org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:215)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
	at org.apache.hyracks.control.nc.Task.run(Task.java:258)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:213)
	... 5 more
Caused by: java.lang.NullPointerException
	at org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:183)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:204)
	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:201)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	... 3 more






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)