You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@impala.apache.org by "Joe McDonnell (JIRA)" <ji...@apache.org> on 2017/08/18 23:44:00 UTC

[jira] [Resolved] (IMPALA-5809) test_breakpad.py failing on exhaustive builds

     [ https://issues.apache.org/jira/browse/IMPALA-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell resolved IMPALA-5809.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.10.0

IMPALA-5809: Relax max_minidumps in breakpad test
The change to address IMPALA-5769 added periodic cleaning for minidumps,
which got in the way of the other minidump tests.
This change sets max_minidumps to the default value (9) for all tests to
keep the cleanup thread from interfering, and then sets a smaller limit
where needed.
Change-Id: I977930ae87b8d4671a89c1e07ba76b12eb92fa55
Reviewed-on: http://gerrit.cloudera.org:8080/7716
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins

> test_breakpad.py failing on exhaustive builds
> ---------------------------------------------
>
>                 Key: IMPALA-5809
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5809
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.10.0
>            Reporter: Joe McDonnell
>            Assignee: Lars Volker
>            Priority: Critical
>              Labels: broken-build
>             Fix For: Impala 2.10.0
>
>
> Multiple recent builds hit issues on test_breakpad.py. Two are specific to test_sigusr1_writes_minidump. This may be related to recent changes in IMPALA-5769. One shows issues in other parts of the test.
> Here is the dumping from the test_sigusr1_writes_minidump failure:
> {code}
> 10:59:35 __________________ TestBreakpad.test_sigusr1_writes_minidump ___________________
> 10:59:35 
> 10:59:35 self = <test_breakpad.TestBreakpad object at 0x4ce7bd0>
> 10:59:35 
> 10:59:35     @pytest.mark.execute_serially
> 10:59:35     def test_sigusr1_writes_minidump(self):
> 10:59:35       """Check that when a daemon receives SIGUSR1, it writes a minidump file."""
> 10:59:35       assert self.count_all_minidumps() == 0
> 10:59:35       self.start_cluster()
> 10:59:35       assert self.count_all_minidumps() == 0
> 10:59:35       cluster_size = self.get_num_processes('impalad')
> 10:59:35       self.kill_cluster(SIGUSR1)
> 10:59:35       # Breakpad forks to write its minidump files, wait for all the clones to terminate.
> 10:59:35       assert self.wait_for_num_processes('impalad', cluster_size) == cluster_size
> 10:59:35       assert self.wait_for_num_processes('catalogd', 1) == 1
> 10:59:35       assert self.wait_for_num_processes('statestored', 1) == 1
> 10:59:35       # Make sure impalad still answers queries.
> 10:59:35       client = self.create_impala_client()
> 10:59:35       self.execute_query_expect_success(client, "SELECT COUNT(*) FROM functional.alltypes")
> 10:59:35       # Kill the cluster. Sending SIGKILL will not trigger minidumps to be written.
> 10:59:35       self.kill_cluster(SIGKILL)
> 10:59:35 >     self.assert_num_minidumps_for_all_daemons(cluster_size)
> 10:59:35 
> 10:59:35 custom_cluster/test_breakpad.py:182: 
> 10:59:35 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> 10:59:35 
> 10:59:35 self = <test_breakpad.TestBreakpad object at 0x4ce7bd0>, cluster_size = 3
> 10:59:35 base_dir = None
> 10:59:35 
> 10:59:35     def assert_num_minidumps_for_all_daemons(self, cluster_size, base_dir=None):
> 10:59:35       self.assert_num_logfile_entries(1)
> 10:59:35 >     assert self.count_minidumps('impalad', base_dir) == cluster_size
> 10:59:35 E     assert 2 == 3
> 10:59:35 E      +  where 2 = <bound method TestBreakpad.count_minidumps of <test_breakpad.TestBreakpad object at 0x4ce7bd0>>('impalad', None)
> 10:59:35 E      +    where <bound method TestBreakpad.count_minidumps of <test_breakpad.TestBreakpad object at 0x4ce7bd0>> = <test_breakpad.TestBreakpad object at 0x4ce7bd0>.count_minidumps
> 10:59:35 
> 10:59:35 custom_cluster/test_breakpad.py:145: AssertionError
> {code}
> The other failure has this output:
> {code}
> 04:46:40 _____________________ TestBreakpad.test_minidump_creation ______________________
> 04:46:40 
> 04:46:40 self = <test_breakpad.TestBreakpad object at 0x54399d0>
> 04:46:40 
> 04:46:40     @pytest.mark.execute_serially
> 04:46:40     def test_minidump_creation(self):
> 04:46:40       """Check that when a daemon crashes, it writes a minidump file."""
> 04:46:40       assert self.count_all_minidumps() == 0
> 04:46:40       self.start_cluster()
> 04:46:40       assert self.count_all_minidumps() == 0
> 04:46:40       cluster_size = self.get_num_processes('impalad')
> 04:46:40       self.kill_cluster(SIGSEGV)
> 04:46:40 >     self.assert_num_minidumps_for_all_daemons(cluster_size)
> 04:46:40 
> 04:46:40 custom_cluster/test_breakpad.py:163: 
> 04:46:40 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> 04:46:40 custom_cluster/test_breakpad.py:144: in assert_num_minidumps_for_all_daemons
> 04:46:40     self.assert_num_logfile_entries(1)
> 04:46:40 custom_cluster/test_breakpad.py:151: in assert_num_logfile_entries
> 04:46:40     expected_count=expected_count)
> 04:46:40 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> 04:46:40 
> 04:46:40 self = <test_breakpad.TestBreakpad object at 0x54399d0>, level = 'INFO'
> 04:46:40 line_regex = 'Wrote minidump to ', expected_count = 1
> 04:46:40 
> 04:46:40     def assert_impalad_log_contains(self, level, line_regex, expected_count=1):
> 04:46:40       """
> 04:46:40         Assert that impalad log with specified level (e.g. ERROR, WARNING, INFO) contains
> 04:46:40         expected_count lines with a substring matching the regex. When using this method to
> 04:46:40         check log files of running processes, the caller should make sure that log buffering
> 04:46:40         has been disabled, for example by adding '-logbuflevel=-1' to the daemon startup
> 04:46:40         options.
> 04:46:40         """
> 04:46:40       pattern = re.compile(line_regex)
> 04:46:40       found = 0
> 04:46:40       log_file_path = os.path.join(self.impala_log_dir, "impalad." + level)
> 04:46:40       # Resolve symlinks to make finding the file easier.
> 04:46:40       log_file_path = os.path.realpath(log_file_path)
> 04:46:40       with open(log_file_path) as log_file:
> 04:46:40         for line in log_file:
> 04:46:40           if pattern.search(line):
> 04:46:40             found += 1
> 04:46:40       assert found == expected_count, ("Expected %d lines in file %s matching regex '%s'"
> 04:46:40           + ", but found %d lines. Last line was: \n%s") % (expected_count, log_file_path,
> 04:46:40 >                                                           line_regex, found, line)
> 04:46:40 E     AssertionError: Expected 1 lines in file /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/impalad.impala-boost-static-burst-slave-105e.vpc.cloudera.com.jenkins.log.INFO.20170817-034950.31592 matching regex 'Wrote minidump to ', but found 2 lines. Last line was: 
> 04:46:40 E     Wrote minidump to /tmp/tmpIwVjcx/impalad/2a84fe94-2f54-4835-2e835cae-7c485c64.dmp
> 04:46:40 
> 04:46:40 common/custom_cluster_test_suite.py:159: AssertionError
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)