You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/03/13 13:02:00 UTC

[jira] [Commented] (IMPALA-10561) Add support for spilling to GCS

    [ https://issues.apache.org/jira/browse/IMPALA-10561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300851#comment-17300851 ] 

ASF subversion and git services commented on IMPALA-10561:
----------------------------------------------------------

Commit 2dfc68d85277f05bf20c09e31dd10c9474ada62c in impala's branch refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2dfc68d ]

IMPALA-7712: Support Google Cloud Storage

This patch adds support for GCS(Google Cloud Storage). Using the
gcs-connector, the implementation is similar to other remote
FileSystems.

New flags for GCS:
 - num_gcs_io_threads: Number of GCS I/O threads. Defaults to be 16.

Follow-up:
 - Support for spilling to GCS will be addressed in IMPALA-10561.
 - Support for caching GCS file handles will be addressed in
   IMPALA-10568.
 - test_concurrent_inserts and test_failing_inserts in
   test_acid_stress.py are skipped due to slow file listing on
   GCS (IMPALA-10562).
 - Some tests are skipped due to issues introduced by /etc/hosts setting
   on GCE instances (IMPALA-10563).

Tests:
 - Compile and create hdfs test data on a GCE instance. Upload test data
   to a GCS bucket. Modify all locations in HMS DB to point to the GCS
   bucket. Remove some hdfs caching params. Run CORE tests.
 - Compile and load snapshot data to a GCS bucket. Run CORE tests.

Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b
Reviewed-on: http://gerrit.cloudera.org:8080/17121
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Add support for spilling to GCS
> -------------------------------
>
>                 Key: IMPALA-10561
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10561
>             Project: IMPALA
>          Issue Type: New Feature
>            Reporter: Quanlong Huang
>            Priority: Major
>
> After IMPALA-7712 and IMPALA-9828 are resolved, we can add support for spilling to GCS as well. Maybe few code changes are needed. We need to figure out the corresponding options for GCS connections similar to "fs.s3a.fast.upload" and "fs.s3a.fast.upload.buffer" for S3 connections. Also need to add tests and metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org