Posted to issues-all@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/11/17 21:42:00 UTC

[jira] [Assigned] (IMPALA-9759) Revisit integration of snapshot dataload with s3guard

     [ https://issues.apache.org/jira/browse/IMPALA-9759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong reassigned IMPALA-9759:
-------------------------------------

    Assignee:     (was: Sahil Takiar)

> Revisit integration of snapshot dataload with s3guard
> -----------------------------------------------------
>
>                 Key: IMPALA-9759
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9759
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 4.0
>            Reporter: Joe McDonnell
>            Priority: Critical
>              Labels: broken-build, flaky
>
> Sometimes the s3 jobs (which use s3guard for consistency) see test failures due to files missing from the dataload snapshot (see bottom). This may be related to how snapshot loading interacts with s3guard. We should nail down exactly the right procedure for loading the snapshot. Currently, we do the following:
> 1. Remove any data from the s3 bucket via the s3 command line
> 2. Create the s3guard dynamodb table (or reuse the existing one if a previous job failed without deleting the old dynamodb table)
> 3. Prune any existing entries from that table
> 4. Load the snapshot to the s3 bucket
> In theory, this leaves s3guard with an empty dynamodb table and an s3 bucket with data. As tests progress and try to access the s3 bucket, s3guard would see that there is no entry in the dynamodb table and then check the underlying s3 bucket.
> We need to revisit these steps and verify that everything is being done correctly.
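For reference, the four steps above can be written out as an ordered list of commands. This is only a sketch under assumptions: the issue does not show the commands the job actually runs, so the use of `aws s3 rm`, `aws s3 sync`, `hadoop s3guard init` and `hadoop s3guard prune`, the one-second prune age, and the function and parameter names are illustrative choices rather than the real dataload script. The function only builds the commands and does not execute anything.

```python
def build_snapshot_load_commands(bucket, dynamodb_table, snapshot_dir):
    """Return the four load steps as argv lists, in the order described above.

    Hypothetical helper: all names and the exact commands are assumptions,
    not taken from the real dataload script.
    """
    s3_url = "s3://{0}/".format(bucket)
    s3a_url = "s3a://{0}".format(bucket)
    meta = "dynamodb://{0}".format(dynamodb_table)
    return [
        # 1. Remove any data from the bucket via the S3 command line.
        ["aws", "s3", "rm", "--recursive", s3_url],
        # 2. Create the s3guard DynamoDB table (assumed to tolerate an
        #    existing table left behind by a failed job).
        ["hadoop", "s3guard", "init", "-meta", meta, s3a_url],
        # 3. Prune existing entries; the one-second age is an assumption.
        ["hadoop", "s3guard", "prune", "-seconds", "1", "-meta", meta, s3a_url],
        # 4. Load the snapshot into the bucket, bypassing s3guard.
        ["aws", "s3", "sync", snapshot_dir, s3_url],
    ]


if __name__ == "__main__":
    for argv in build_snapshot_load_commands(
            "example-bucket", "example-table", "/tmp/snapshot"):
        print(" ".join(argv))
```

Writing the steps down this way makes the ordering explicit: step 4 writes to the bucket without going through s3guard, so any entry that survives step 3 could disagree with what is actually in S3.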
> {noformat}
> metadata/test_metadata_query_statements.py:70: in test_show_stats
>     self.run_test_case('QueryTest/show-stats', vector, "functional")
> common/impala_test_suite.py:687: in run_test_case
>     self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:523: in __verify_results_and_errors
>     replace_filenames_with_placeholder)
> common/test_result_verifier.py:456: in verify_raw_results
>     VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
>     assert expected_results == actual_results
> E assert Comparing QueryTestResults (expected vs actual):
> E '2009','1',310,1,'19.95KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=1' == '2009','1',310,1,'19.95KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=1'
> E '2009','10',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=10' == '2009','10',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=10'
> E '2009','11',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=11' == '2009','11',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=11'
> E '2009','12',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=12' == '2009','12',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=12'
> E '2009','2',280,1,'18.12KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=2' == '2009','2',280,1,'18.12KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=2'
> E '2009','3',310,1,'20.06KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=3' == '2009','3',310,1,'20.06KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=3'
> E '2009','4',300,1,'19.61KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=4' == '2009','4',300,1,'19.61KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=4'
> E '2009','5',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=5' != '2009','5',0,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=5'
> E '2009','6',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=6' == '2009','6',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=6'
> E '2009','7',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=7' == '2009','7',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=7'
> E '2009','8',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=8' == '2009','8',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=8'
> E '2009','9',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9' == '2009','9',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9'
> E '2010','1',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=1' == '2010','1',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=1'
> E '2010','10',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=10' == '2010','10',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=10'
> E '2010','11',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=11' == '2010','11',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=11'
> E '2010','12',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=12' == '2010','12',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=12'
> E '2010','2',280,1,'18.39KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=2' == '2010','2',280,1,'18.39KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=2'
> E '2010','3',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=3' == '2010','3',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=3'
> E '2010','4',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=4' == '2010','4',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=4'
> E '2010','5',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=5' == '2010','5',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=5'
> E '2010','6',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=6' == '2010','6',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=6'
> E '2010','7',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=7' == '2010','7',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=7'
> E '2010','8',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=8' == '2010','8',310,1,'20.36KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=8'
> E '2010','9',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=9' == '2010','9',300,1,'19.71KB','NOT CACHED','NOT CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=9'
> E 'Total','',7300,24,'478.45KB','0B','','','','' != 'Total','',6990,24,'478.45KB','0B','','','',''
> {noformat}
> This also shows up in cardinality calculations:
> {noformat}
> metadata/test_explain.py:113: in test_explain_validate_cardinality_estimates
>     check_cardinality(result.data, '7.30K')
> metadata/test_explain.py:98: in check_cardinality
>     query_result, expected_cardinality=expected_cardinality)
> metadata/test_explain.py:86: in check_row_size_and_cardinality
>     assert m.groups()[1] == expected_cardinality
> E assert '6.99K' == '7.30K'
> E - 6.99K
> E + 7.30K
> {noformat}
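
Both failures quoted above are consistent with a single partition, year=2009/month=5, reporting 0 rows instead of 310. The short check below reproduces the arithmetic from the row counts in the expected results. The `format_cardinality` helper is a hypothetical stand-in written for this illustration and is not Impala's own formatting code.

```python
# Expected row counts per month of alltypes, copied from the show-stats
# output above (the counts are the same for 2009 and 2010).
ROWS_PER_MONTH = {
    1: 310, 2: 280, 3: 310, 4: 300, 5: 310, 6: 300,
    7: 310, 8: 310, 9: 300, 10: 310, 11: 300, 12: 310,
}


def format_cardinality(rows):
    """Hypothetical helper: render a row count in the '7.30K' style."""
    return "{0:.2f}K".format(rows / 1000.0)


# Two years of data give the expected table total.
expected_total = 2 * sum(ROWS_PER_MONTH.values())

# The failing run reported 0 rows for year=2009/month=5.
actual_total = expected_total - ROWS_PER_MONTH[5]

print(expected_total, format_cardinality(expected_total))
print(actual_total, format_cardinality(actual_total))
```

The two totals match the 7300 versus 6990 mismatch in test_show_stats and the '7.30K' versus '6.99K' mismatch in test_explain, which suggests both tests are seeing the same missing file rather than two separate problems.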



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
