Posted to issues@impala.apache.org by "Joe McDonnell (JIRA)" <ji...@apache.org> on 2019/02/22 17:29:00 UTC

[jira] [Resolved] (IMPALA-8178) Tests failing with “Could not allocate memory while trying to increase reservation” on EC filesystem

     [ https://issues.apache.org/jira/browse/IMPALA-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell resolved IMPALA-8178.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 3.2.0

> Tests failing with “Could not allocate memory while trying to increase reservation” on EC filesystem
> ----------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-8178
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8178
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 3.2.0
>            Reporter: Andrew Sherman
>            Assignee: Joe McDonnell
>            Priority: Blocker
>              Labels: broken-build
>             Fix For: Impala 3.2.0
>
>
> In tests run against an Erasure Coding (EC) filesystem, multiple tests failed with memory allocation errors.
> In total, 10 tests failed:
>  * query_test.test_scanners.TestParquet.test_decimal_encodings
>  * query_test.test_scanners.TestTpchScanRangeLengths.test_tpch_scan_ranges
>  * query_test.test_exprs.TestExprs.test_exprs [enable_expr_rewrites: 0]
>  * query_test.test_exprs.TestExprs.test_exprs [enable_expr_rewrites: 1]
>  * query_test.test_hbase_queries.TestHBaseQueries.test_hbase_scan_node
>  * query_test.test_scanners.TestParquet.test_def_levels
>  * query_test.test_scanners.TestTextSplitDelimiters.test_text_split_across_buffers_delimiter
>  * query_test.test_hbase_queries.TestHBaseQueries.test_hbase_filters
>  * query_test.test_hbase_queries.TestHBaseQueries.test_hbase_inline_views
>  * query_test.test_hbase_queries.TestHBaseQueries.test_hbase_top_n
> The first failure looked like this on the client side:
> {quote}
> F query_test/test_scanners.py::TestParquet::()::test_decimal_encodings[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'debug_action': '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@0.5', 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]
>  query_test/test_scanners.py:717: in test_decimal_encodings
>      self.run_test_case('QueryTest/parquet-decimal-formats', vector, unique_database)
>  common/impala_test_suite.py:472: in run_test_case
>      result = self.__execute_query(target_impalad_client, query, user=user)
>  common/impala_test_suite.py:699: in __execute_query
>      return impalad_client.execute(query, user=user)
>  common/impala_connection.py:174: in execute
>      return self.__beeswax_client.execute(sql_stmt, user=user)
>  beeswax/impala_beeswax.py:183: in execute
>      handle = self.__execute_query(query_string.strip(), user=user)
>  beeswax/impala_beeswax.py:360: in __execute_query
>      self.wait_for_finished(handle)
>  beeswax/impala_beeswax.py:381: in wait_for_finished
>      raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
>  E   ImpalaBeeswaxException: ImpalaBeeswaxException:
>  E    Query aborted:ExecQueryFInstances rpc query_id=6e44c3c949a31be2:f973c7ff00000000 failed: Failed to get minimum memory reservation of 8.00 KB on daemon xxx.com:22001 for query 6e44c3c949a31be2:f973c7ff00000000 due to following error: Memory limit exceeded: Could not allocate memory while trying to increase reservation.
>  E   Query(6e44c3c949a31be2:f973c7ff00000000) could not allocate 8.00 KB without exceeding limit.
>  E   Error occurred on backend xxx.com:22001
>  E   Memory left in process limit: 1.19 GB
>  E   Query(6e44c3c949a31be2:f973c7ff00000000): Reservation=0 ReservationLimit=9.60 GB OtherMemory=0 Total=0 Peak=0
>  E   Memory is likely oversubscribed. Reducing query concurrency or configuring admission control may help avoid this error.
> {quote}
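> Note that the exec options above include {{'debug_action': '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@0.5'}}, which makes each reservation increase fail with probability 0.5 even when memory is available, so the 8.00 KB minimum reservation can be denied regardless of the process limit. A toy Python model of that behavior (the class and method names here are illustrative, not Impala's actual C++ buffer-pool API) is sketched below:
> {code}
> import random
>
> class ReservationDenied(Exception):
>     """Raised when a reservation increase is denied."""
>
> class BufferPoolSim:
>     """Toy model of a per-query reservation whose increases can be
>     probabilistically denied, mimicking the debug_action
>     SET_DENY_RESERVATION_PROBABILITY@0.5 used by the failing tests."""
>
>     def __init__(self, limit_bytes, deny_probability, rng):
>         self.limit_bytes = limit_bytes
>         self.deny_probability = deny_probability
>         self.reservation = 0
>         self.rng = rng
>
>     def increase_reservation(self, bytes_needed):
>         # The injected debug action: fail randomly, even with memory free.
>         if self.rng.random() < self.deny_probability:
>             raise ReservationDenied(
>                 "Could not allocate memory while trying to increase reservation.")
>         # The normal path: fail only when the limit would be exceeded.
>         if self.reservation + bytes_needed > self.limit_bytes:
>             raise ReservationDenied("would exceed reservation limit")
>         self.reservation += bytes_needed
>
> # Request the 8.00 KB minimum reservation many times; roughly half are denied
> # even though the 9.60 GB limit is nowhere near exhausted.
> rng = random.Random(42)
> pool = BufferPoolSim(limit_bytes=int(9.6 * 2**30), deny_probability=0.5, rng=rng)
> denied = sum(
>     1 for _ in range(1000)
>     if (lambda: (_try := True))() and _denied(pool))
> {code}
> (The counting loop is clearer written out: call {{pool.increase_reservation(8 * 1024)}} in a try/except and count the {{ReservationDenied}} exceptions; about half of 1000 attempts will raise.)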
> In the server-side log:
> {quote}
> I0207 18:25:19.329311  5562 impala-server.cc:1063] 6e44c3c949a31be2:f973c7ff00000000] Registered query query_id=6e44c3c949a31be2:f973c7ff00000000 session_id=93497065f69e9d01:8a3bd06faff3da5
> I0207 18:25:19.329434  5562 Frontend.java:1242] 6e44c3c949a31be2:f973c7ff00000000] Analyzing query: select score from decimal_stored_as_int32
> I0207 18:25:19.329583  5562 FeSupport.java:285] 6e44c3c949a31be2:f973c7ff00000000] Requesting prioritized load of table(s): test_decimal_encodings_28d99c0e.decimal_stored_as_int32
> I0207 18:25:30.776041  5562 Frontend.java:1282] 6e44c3c949a31be2:f973c7ff00000000] Analysis finished.
> I0207 18:25:35.919486 10418 admission-controller.cc:608] 6e44c3c949a31be2:f973c7ff00000000] Schedule for id=6e44c3c949a31be2:f973c7ff00000000 in pool_name=default-pool per_host_mem_estimate=16.02 MB PoolConfig: max_requests=-1 max_queued=200 max_mem=-1.00 B
> I0207 18:25:35.919528 10418 admission-controller.cc:613] 6e44c3c949a31be2:f973c7ff00000000] Stats: agg_num_running=2, agg_num_queued=0, agg_mem_reserved=24.13 MB,  local_host(local_mem_admitted=1.99 GB, num_admitted_running=2, num_queued=0, backend_mem_reserved=8.06 MB)
> I0207 18:25:35.919549 10418 admission-controller.cc:645] 6e44c3c949a31be2:f973c7ff00000000] Admitted query id=6e44c3c949a31be2:f973c7ff00000000
> I0207 18:25:35.920532 10418 coordinator.cc:93] 6e44c3c949a31be2:f973c7ff00000000] Exec() query_id=6e44c3c949a31be2:f973c7ff00000000 stmt=select score from decimal_stored_as_int32
> I0207 18:25:35.930855 10418 coordinator.cc:359] 6e44c3c949a31be2:f973c7ff00000000] starting execution on 2 backends for query_id=6e44c3c949a31be2:f973c7ff00000000
> I0207 18:25:35.938108 21110 impala-internal-service.cc:50] 6e44c3c949a31be2:f973c7ff00000000] ExecQueryFInstances(): query_id=6e44c3c949a31be2:f973c7ff00000000 coord=xxx.com:22000 #instances=1
> I0207 18:25:36.037228 10571 query-state.cc:624] 6e44c3c949a31be2:f973c7ff00000000] Executing instance. instance_id=6e44c3c949a31be2:f973c7ff00000000 fragment_idx=0 per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=5
> I0207 18:25:48.149771 12581 coordinator-backend-state.cc:209] ExecQueryFInstances rpc query_id=6e44c3c949a31be2:f973c7ff00000000 failed: Failed to get minimum memory reservation of 8.00 KB on daemon xxx.com:22001 for query 6e44c3c949a31be2:f973c7ff00000000 due to following error: Memory limit exceeded: Could not allocate memory while trying to increase reservation.
> Query(6e44c3c949a31be2:f973c7ff00000000) could not allocate 8.00 KB without exceeding limit.
> Query(6e44c3c949a31be2:f973c7ff00000000): Reservation=0 ReservationLimit=9.60 GB OtherMemory=0 Total=0 Peak=0
> I0207 18:25:48.149895 10418 coordinator.cc:373] 6e44c3c949a31be2:f973c7ff00000000] started execution on 2 backends for query_id=6e44c3c949a31be2:f973c7ff00000000
> I0207 18:25:48.152803 10418 coordinator.cc:527] 6e44c3c949a31be2:f973c7ff00000000] ExecState: query id=6e44c3c949a31be2:f973c7ff00000000 finstance=N/A on host=xxx.com (EXECUTING -> ERROR) status=ExecQueryFInstances rpc query_id=6e44c3c949a31be2:f973c7ff00000000 failed: Failed to get minimum memory reservation of 8.00 KB on daemon xxx.com:22001 for query 6e44c3c949a31be2:f973c7ff00000000 due to following error: Memory limit exceeded: Could not allocate memory while trying to increase reservation.
> Query(6e44c3c949a31be2:f973c7ff00000000) could not allocate 8.00 KB without exceeding limit.
> Query(6e44c3c949a31be2:f973c7ff00000000): Reservation=0 ReservationLimit=9.60 GB OtherMemory=0 Total=0 Peak=0
> I0207 18:25:48.152827 10418 coordinator-backend-state.cc:453] 6e44c3c949a31be2:f973c7ff00000000] Sending CancelQueryFInstances rpc for query_id=6e44c3c949a31be2:f973c7ff00000000 backend=127.0.0.1:27000
> I0207 18:25:48.155086 12737 control-service.cc:168] CancelQueryFInstances(): query_id=6e44c3c949a31be2:f973c7ff00000000
> I0207 18:25:48.155109 12737 query-exec-mgr.cc:97] QueryState: query_id=6e44c3c949a31be2:f973c7ff00000000 refcnt=4
> I0207 18:25:48.155117 12737 query-state.cc:649] Cancel: query_id=6e44c3c949a31be2:f973c7ff00000000
> I0207 18:25:48.155129 12737 krpc-data-stream-mgr.cc:325] cancelling all streams for fragment_instance_id=6e44c3c949a31be2:f973c7ff00000000
> I0207 18:25:48.155297 10418 coordinator.cc:687] 6e44c3c949a31be2:f973c7ff00000000] CancelBackends() query_id=6e44c3c949a31be2:f973c7ff00000000, tried to cancel 1 backends
> I0207 18:25:48.155306 10418 coordinator.cc:859] 6e44c3c949a31be2:f973c7ff00000000] Release admission control resources for query_id=6e44c3c949a31be2:f973c7ff00000000
> I0207 18:25:48.170018 10571 krpc-data-stream-mgr.cc:294] 6e44c3c949a31be2:f973c7ff00000000] DeregisterRecvr(): fragment_instance_id=6e44c3c949a31be2:f973c7ff00000000, node=1
> I0207 18:25:48.197767  5562 impala-beeswax-server.cc:239] close(): query_id=6e44c3c949a31be2:f973c7ff00000000
> I0207 18:25:48.197775  5562 impala-server.cc:1142] UnregisterQuery(): query_id=6e44c3c949a31be2:f973c7ff00000000
> I0207 18:25:48.197779  5562 impala-server.cc:1249] Cancel(): query_id=6e44c3c949a31be2:f973c7ff00000000
> I0207 18:25:48.225905 10529 query-state.cc:272] 6e44c3c949a31be2:f973c7ff00000000] UpdateBackendExecState(): last report for 6e44c3c949a31be2:f973c7ff00000000
> I0207 18:25:48.225889 10571 query-state.cc:632] 6e44c3c949a31be2:f973c7ff00000000] Instance completed. instance_id=6e44c3c949a31be2:f973c7ff00000000 #in-flight=4 status=CANCELLED: Cancelled
> I0207 18:25:48.372977 12737 control-service.cc:125] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): 6e44c3c949a31be2:f973c7ff00000000 remote host=127.0.0.1:50422
> I0207 18:25:48.373118 10529 query-state.cc:431] 6e44c3c949a31be2:f973c7ff00000000] Cancelling fragment instances as directed by the coordinator. Returned status: ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): 6e44c3c949a31be2:f973c7ff00000000 remote host=127.0.0.1:50422
> I0207 18:25:48.373138 10529 query-state.cc:649] 6e44c3c949a31be2:f973c7ff00000000] Cancel: query_id=6e44c3c949a31be2:f973c7ff00000000
> I0207 18:25:48.429422  5562 query-exec-mgr.cc:184] ReleaseQueryState(): deleted query_id=6e44c3c949a31be2:f973c7ff00000000
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)