Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/06/05 00:23:00 UTC

[jira] [Commented] (IMPALA-9896) OutOfMemoryError: Requested array size exceeds VM limit when LocalCatalog is enabled

    [ https://issues.apache.org/jira/browse/IMPALA-9896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17357731#comment-17357731 ] 

ASF subversion and git services commented on IMPALA-9896:
---------------------------------------------------------

Commit bb3062197b134f33e2796fac603e3367ab8bef1a in impala's branch refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=bb30621 ]

IMPALA-7501: Slim down partition metadata in LocalCatalog mode

In LocalCatalog mode, the coordinator caches HMS partition objects in
its local cache. An HMS partition contains many fields that Impala never
uses, and others that are stored suboptimally. Most of them are in the
StorageDescriptor:
 - partition-level schema (list<FieldSchema>), which is never used.
 - location string, which can be prefix-compressed since the prefix is
   usually the table location.
 - input/outputFormat string, which is always duplicated and can be
   represented by an integer (enum).
An experiment on a large table with 478 columns and 87320 partitions
(one non-empty file per partition) shows that more than 90% of the
memory (1.1GB) is occupied by these fields. The dominant part is
the partition-level schema, which consumed 76% of the cache.
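The location prefix-compression mentioned above can be sketched roughly as
follows. This is a minimal illustration of the idea, not Impala's actual
implementation; the class and method names are hypothetical:

```java
// Minimal sketch of prefix-compressing a partition location against the
// table location, so only the per-partition suffix needs to be cached.
// Hypothetical helper, not Impala's actual code.
public class LocationUtil {
  // Returns the partition location with the shared table-location prefix
  // stripped. Partitions outside the table location keep the full string.
  public static String compress(String tableLocation, String partitionLocation) {
    if (partitionLocation.startsWith(tableLocation)) {
      return partitionLocation.substring(tableLocation.length());
    }
    return partitionLocation;
  }

  // Rebuilds the full location. In this sketch a suffix starting with '/'
  // marks a location that was relative to the table location, which holds
  // for URI-style locations like hdfs://... but not for bare POSIX paths.
  public static String decompress(String tableLocation, String suffix) {
    return suffix.startsWith("/") ? tableLocation + suffix : suffix;
  }
}
```

With thousands of partitions under one table directory, the shared prefix is
stored once instead of once per partition.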

On the other hand, these unused or suboptimal fields are returned in one
response from catalogd, wrapped in TPartialPartitionInfo objects that
ultimately belong to a TGetPartialCatalogObjectResponse. They dramatically
increase the serialized thrift size of the response, which is subject to
the JVM's 2GB array size limit. Fetching metadata for many partitions from
catalogd can therefore make it hit the 2GB limit and run into an OOM error
(e.g. IMPALA-9896).
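The 2GB limit comes from the JVM itself: array lengths are int-valued, and
HotSpot additionally reserves a few header words, so a serialization buffer
cannot grow to Integer.MAX_VALUE bytes. A minimal illustration (HotSpot
behavior; the exact error message can vary by JVM):

```java
// Demonstrates the JVM array-size ceiling that a huge serialized thrift
// response runs into: asking for Integer.MAX_VALUE bytes fails on HotSpot
// with "Requested array size exceeds VM limit" regardless of heap size.
public class ArrayLimitDemo {
  public static String tryAllocate(int size) {
    try {
      byte[] buf = new byte[size];
      return "allocated " + buf.length + " bytes";
    } catch (OutOfMemoryError e) {
      // Allocation failed cleanly before any memory was handed out.
      return "OutOfMemoryError: " + e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(tryAllocate(Integer.MAX_VALUE));
  }
}
```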

This patch extracts the HMS partition object and replaces it with the
fields that Impala actually uses. In the LocalCatalog cache, the HMS
partition object is replaced with
 - hms parameters
 - write id
 - HdfsStorageDescriptor which represents the input/output format and
   some delimiters.
 - prefix-compressed location
The hms_partition field of TPartialPartitionInfo is likewise replaced with
the corresponding fields. However, CatalogHmsAPIHelper still requires the
whole HMS partition object, so the hms_partition field is kept for that
use case. To distinguish the two requirements, we add a new field,
'want_hms_partition', in TTableInfoSelector. The existing
'want_partition_metadata' field means returning the extracted fields,
and the new 'want_hms_partition' field means returning the whole HMS
partition object.
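The slimmed-down cache entry described above can be pictured roughly like
this. It is a hypothetical sketch of the idea only; the field names follow
the commit description, not the actual Impala classes:

```java
import java.util.Map;

// Hypothetical sketch of a slim partition cache entry: instead of the full
// HMS Partition object, keep only the fields Impala actually reads.
public class SlimPartition {
  final Map<String, String> hmsParameters;  // hms parameters
  final long writeId;                       // write id
  final int fileFormatId;                   // enum id standing in for the
                                            // duplicated input/outputFormat strings
  final String locationSuffix;              // location, prefix-compressed
                                            // against the table location

  SlimPartition(Map<String, String> hmsParameters, long writeId,
      int fileFormatId, String locationSuffix) {
    this.hmsParameters = hmsParameters;
    this.writeId = writeId;
    this.fileFormatId = fileFormatId;
    this.locationSuffix = locationSuffix;
  }

  // Rebuild the full location on demand rather than storing the shared
  // prefix once per partition.
  String location(String tableLocation) {
    return tableLocation + locationSuffix;
  }
}
```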

Improvement results in the above case:
 - reduce the heap usage from 1.1GB to 113.2MB, objects from 41m to 2.3m
 - reduce the response size from 1.7GB to 28.41MB.

Tests:
 - Run local-catalog related tests locally
 - Run CORE tests

Change-Id: I307e7a8193b54a7b3ab93d9ebd194766bbdbd977
Reviewed-on: http://gerrit.cloudera.org:8080/17505
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Vihang Karajgaonkar <vi...@cloudera.com>


> OutOfMemoryError: Requested array size exceeds VM limit when LocalCatalog is enabled
> ------------------------------------------------------------------------------------
>
>                 Key: IMPALA-9896
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9896
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 3.2.0
>            Reporter: abeltian
>            Assignee: guojingfeng
>            Priority: Critical
>              Labels: crashed, performance
>
> OutOfMemoryError: Requested array size exceeds VM limit when LocalCatalog is enabled.  
> The basic information of the large table is as follows:
> 101 columns, 785243 partitions, 5729866 files.
> {code:java}
> I0626 20:59:04.029678 3392438 jni-util.cc:256] java.lang.OutOfMemoryError: Requested array size exceeds VM limit
> I0626 20:59:04.030231 3392438 status.cc:124] OutOfMemoryError: Requested array size exceeds VM limit
>     @           0xb35f19
>     @          0x113112e
>     @           0xb23b87
>     @           0xb0e339
>     @           0xc15a52
>     @           0xc09e4c
>     @           0xb01de9
>     @           0xf159e8
>     @           0xf0cd7e
>     @           0xf0dc11
>     @          0x11a1e3f
>     @          0x11a29e9
>     @          0x1790be9
>     @     0x7f55188a2e24
>     @     0x7f55185cf35c
> E0626 20:59:04.030258 3392438 catalog-server.cc:176] OutOfMemoryError: Requested array size exceeds VM limit
> {code}
>  The source code corresponding to the error is as follows:
> {code:cpp}
> void GetPartialCatalogObject(TGetPartialCatalogObjectResponse& resp,
>     const TGetPartialCatalogObjectRequest& req) override {
>   // TODO(todd): capture detailed metrics on the types of inbound requests, lock
>   // wait times, etc.
>   // TODO(todd): add some kind of limit on the number of concurrent requests here
>   // to avoid thread exhaustion -- eg perhaps it would be best to use a trylock
>   // on the catalog locks, or defer these calls to a separate (bounded) queue,
>   // so a heavy query workload against a table undergoing a slow refresh doesn't
>   // end up taking down the catalog by creating thousands of threads.
>   VLOG_RPC << "GetPartialCatalogObject(): request=" << ThriftDebugString(req);
>   Status status = catalog_server_->catalog()->GetPartialCatalogObject(req, &resp);
>   if (!status.ok()) LOG(ERROR) << status.GetDetail();  // catalog-server.cc:176
>   TStatus thrift_status;
>   status.ToThrift(&thrift_status);
>   resp.__set_status(thrift_status);
>   VLOG_RPC << "GetPartialCatalogObject(): response=" << ThriftDebugString(resp);
> }
> {code}
> Related: https://issues.apache.org/jira/browse/IMPALA-7436
> Even so, the following code will still load all partitions:
> {code:java}
> // org.apache.impala.catalog.local.LocalFsTable
> @Override
> public long getTotalHdfsBytes() {
>   // TODO(todd): this is slow because it requires loading all partitions. Remove if possible.
>   long size = 0;
>   for (FeFsPartition p : loadPartitions(getPartitionIds())) {
>     size += p.getSize();
>   }
>   return size;
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
