Posted to issues@impala.apache.org by "Joe McDonnell (Jira)" <ji...@apache.org> on 2020/12/04 19:10:00 UTC

[jira] [Resolved] (IMPALA-9759) Revisit integration of snapshot dataload with s3guard

     [ https://issues.apache.org/jira/browse/IMPALA-9759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell resolved IMPALA-9759.
-----------------------------------
    Fix Version/s: Not Applicable
       Resolution: Won't Fix

S3 now has strong consistency, which will render S3Guard obsolete. No improvements are planned for this codepath, so I'm closing this.

> Revisit integration of snapshot dataload with s3guard
> -----------------------------------------------------
>
>                 Key: IMPALA-9759
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9759
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 4.0
>            Reporter: Joe McDonnell
>            Priority: Critical
>              Labels: broken-build, flaky
>             Fix For: Not Applicable
>
>
> Sometimes, the s3 jobs (which use s3guard for consistency) see test failures due to missing files from the dataload snapshot (see bottom). This may be related to the interaction of snapshot loading with s3guard. We should nail down exactly the right procedure for loading the snapshot. Currently, we do the following:
> 1. Remove any data from the s3 bucket via the s3 command line
> 2. Create the s3guard dynamodb table (or reuse the existing one if a previous job failed without deleting the old dynamodb table)
> 3. Prune any existing entries from that table
> 4. Load the snapshot to the s3 bucket
> In theory, this leaves s3guard with an empty dynamodb table and an s3 bucket with data. As tests progress and try to access the s3 bucket, s3guard would see that there is no entry in the dynamodb table and then check the underlying s3 bucket.
> We need to revisit these steps and verify that everything is being done correctly.
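
For illustration only, the four steps and the lookup behavior the ticket expects "in theory" can be modeled with two in-memory dictionaries. This is a toy sketch, not S3Guard's implementation or the actual dataload scripts: the class ToyGuardedStore, the function load_snapshot, and the file paths are all hypothetical names made up for this example.

```python
class ToyGuardedStore:
    """Toy stand-in: `bucket` plays the s3 bucket, `table` the dynamodb table."""

    def __init__(self, bucket=None, table=None):
        self.bucket = dict(bucket or {})
        self.table = dict(table or {})

    def exists(self, path):
        # Expected behavior per the ticket: on a table miss, fall back
        # to checking the underlying bucket.
        if path in self.table:
            return True
        return path in self.bucket


def load_snapshot(store, snapshot):
    """Run steps 1-4 in order, then report snapshot paths that are not visible."""
    store.bucket.clear()            # 1. remove any data from the bucket
    # 2. create or reuse the table (here it already exists as store.table)
    store.table.clear()             # 3. prune any existing table entries
    store.bucket.update(snapshot)   # 4. load the snapshot into the bucket
    return sorted(p for p in snapshot if not store.exists(p))


# Hypothetical leftovers from a previous failed job, plus a made-up snapshot.
store = ToyGuardedStore(bucket={"old/file": b"x"}, table={"old/file": 1})
snapshot = {
    "alltypes/year=2009/month=5/file-a": b"rows",
    "alltypes/year=2009/month=6/file-b": b"rows",
}
missing = load_snapshot(store, snapshot)
print(missing)
```

In this model every snapshot file is visible after the load, which is the outcome the procedure is meant to guarantee. A post-load check of this kind (listing snapshot paths that cannot be seen through the guarded store) is one possible way to verify the steps; whether it would catch the failure described here is an open question.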
> {noformat}
> metadata/test_metadata_query_statements.py:70: in test_show_stats
>     self.run_test_case('QueryTest/show-stats', vector, "functional")
> common/impala_test_suite.py:687: in run_test_case
>     self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:523: in __verify_results_and_errors
>     replace_filenames_with_placeholder)
> common/test_result_verifier.py:456: in verify_raw_results
>     VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
>     assert expected_results == actual_results
> E assert Comparing QueryTestResults (expected vs actual):
> E '2009','1',310,1,'19.95KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=1' == '2009','1',310,1,'19.95KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=1'
> E '2009','10',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=10' == '2009','10',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=10'
> E '2009','11',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=11' == '2009','11',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=11'
> E '2009','12',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=12' == '2009','12',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=12'
> E '2009','2',280,1,'18.12KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=2' == '2009','2',280,1,'18.12KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=2'
> E '2009','3',310,1,'20.06KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=3' == '2009','3',310,1,'20.06KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=3'
> E '2009','4',300,1,'19.61KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=4' == '2009','4',300,1,'19.61KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=4'
> E '2009','5',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=5' != '2009','5',0,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=5'
> E '2009','6',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=6' == '2009','6',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=6'
> E '2009','7',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=7' == '2009','7',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=7'
> E '2009','8',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=8' == '2009','8',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=8'
> E '2009','9',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9' == '2009','9',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9'
> E '2010','1',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=1' == '2010','1',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=1'
> E '2010','10',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=10' == '2010','10',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=10'
> E '2010','11',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=11' == '2010','11',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=11'
> E '2010','12',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=12' == '2010','12',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=12'
> E '2010','2',280,1,'18.39KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=2' == '2010','2',280,1,'18.39KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=2'
> E '2010','3',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=3' == '2010','3',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=3'
> E '2010','4',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=4' == '2010','4',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=4'
> E '2010','5',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=5' == '2010','5',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=5'
> E '2010','6',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=6' == '2010','6',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=6'
> E '2010','7',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=7' == '2010','7',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=7'
> E '2010','8',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=8' == '2010','8',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=8'
> E '2010','9',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=9' == '2010','9',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=9'
> E 'Total','',7300,24,'478.45KB','0B','','','','' != 'Total','',6990,24,'478.45KB','0B','','','',''
> {noformat}
> This also shows up in cardinality calculations:
> {noformat}
> metadata/test_explain.py:113: in test_explain_validate_cardinality_estimates
>     check_cardinality(result.data, '7.30K')
> metadata/test_explain.py:98: in check_cardinality
>     query_result, expected_cardinality=expected_cardinality)
> metadata/test_explain.py:86: in check_row_size_and_cardinality
>     assert m.groups()[1] == expected_cardinality
> E assert '6.99K' == '7.30K'
> E - 6.99K
> E + 7.30K
> {noformat}
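
The two failures quoted above appear to be the same discrepancy seen twice: the year=2009/month=5 partition reports 0 rows instead of 310, and 7300 - 310 = 6990. The short check below reproduces that arithmetic from the per-partition row counts in the show-stats output. The helper format_cardinality is a hypothetical approximation of the "N.NNK" display seen in the assertions, not Impala's own formatting code.

```python
# Expected per-month row counts, copied from the show-stats output above
# (the 2009 and 2010 partitions list the same counts).
ROWS_PER_MONTH = {1: 310, 2: 280, 3: 310, 4: 300, 5: 310, 6: 300,
                  7: 310, 8: 310, 9: 300, 10: 310, 11: 300, 12: 310}

expected = {(year, month): rows
            for year in (2009, 2010)
            for month, rows in ROWS_PER_MONTH.items()}

# The single mismatching partition in the output reported 0 rows.
actual = dict(expected)
actual[(2009, 5)] = 0


def format_cardinality(n):
    """Hypothetical helper approximating the 'N.NNK' display in the assertions."""
    return f"{n / 1000:.2f}K"


expected_total = sum(expected.values())
actual_total = sum(actual.values())
print(expected_total, actual_total)   # 7300 6990
print(format_cardinality(expected_total), format_cardinality(actual_total))  # 7.30K 6.99K
```

Both totals match the values in the two assertion failures, which is consistent with a single partition's data file being invisible when the table stats were gathered.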


