Posted to issues@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2020/05/29 09:38:00 UTC

[jira] [Created] (IMPALA-9798) TestScratchDir.test_multiple_dirs fails to start impalad

Quanlong Huang created IMPALA-9798:
--------------------------------------

             Summary: TestScratchDir.test_multiple_dirs fails to start impalad
                 Key: IMPALA-9798
                 URL: https://issues.apache.org/jira/browse/IMPALA-9798
             Project: IMPALA
          Issue Type: Bug
            Reporter: Quanlong Huang
             Fix For: Impala 4.0
         Attachments: impalad.impala-ec2-centos74-m5-4xlarge-ondemand-0b8f.vpc.cloudera.com.jenkins.log.INFO.20200528-151451.15245

Seen in an exhaustive test run (impala-asf-master-exhaustive):
Stacktrace:
{code}
custom_cluster/test_scratch_disk.py:97: in test_multiple_dirs
    '--impalad_args=--disk_spill_punch_holes=true'])
common/custom_cluster_test_suite.py:277: in _start_impala_cluster
    check_call(cmd + options, close_fds=True)
/data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/python-2.7.16/lib/python2.7/subprocess.py:190: in check_call
    raise CalledProcessError(retcode, cmd)
E   CalledProcessError: Command '['/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/start-impala-cluster.py', '--state_store_args=--statestore_update_frequency_ms=50     --statestore_priority_update_frequency_ms=50     --statestore_heartbeat_frequency_ms=50', '--cluster_size=3', '--num_coordinators=3', '--log_dir=/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests', '--log_level=1', '--impalad_args=-logbuflevel=-1 -scratch_dirs=/tmp/tmpR006lp,/tmp/tmpzKVBYt,/tmp/tmpBLcN_O,/tmp/tmp6kqoj5,/tmp/tmpT_R39r', '--impalad_args=--allow_multiple_scratch_dirs_per_device=false', '--impalad_args=--disk_spill_compression_codec=zstd', '--impalad_args=--disk_spill_punch_holes=true', '--impalad_args=--default_query_options=']' returned non-zero exit status 1
{code}
Standard Output:
{code}
Generated dir/tmp/tmpR006lp
Generated dir/tmp/tmpzKVBYt
Generated dir/tmp/tmpBLcN_O
Generated dir/tmp/tmp6kqoj5
Generated dir/tmp/tmpT_R39r
{code}
Standard Error:
{code}
15:14:51 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
15:14:51 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/statestored.INFO
15:14:51 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
15:14:51 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad.INFO
15:14:51 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
15:14:51 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
15:14:54 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
15:14:54 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
15:14:54 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-0b8f.vpc.cloudera.com:25000
15:14:54 MainThread: 'backends'
15:14:54 MainThread: Waiting for num_known_live_backends=3. Current value: None
15:14:55 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
15:14:55 MainThread: Error starting cluster
Traceback (most recent call last):
  File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/start-impala-cluster.py", line 770, in <module>
    expected_cluster_size - expected_catalog_delays)
  File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_cluster.py", line 186, in wait_until_ready
    early_abort_fn=check_processes_still_running)
  File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_service.py", line 284, in wait_for_num_known_live_backends
    early_abort_fn()
  File "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_cluster.py", line 178, in check_processes_still_running
    assert len(self.impalads) >= expected_num_impalads
AssertionError
DEBUG:impala_cluster:Found 2 impalad/1 statestored/1 catalogd process(es)
{code}
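
The traceback above shows the wait loop stopping through its early-abort callback once one of the three impalad processes has gone away. A minimal sketch of that pattern follows. The names wait_until, make_process_check and count_impalads, and the timing values, are hypothetical and only for illustration; this is not the actual code in tests/common/impala_service.py or tests/common/impala_cluster.py.

```python
import time


def wait_until(condition_fn, early_abort_fn=None, timeout_s=10.0, interval_s=0.1):
    """Poll condition_fn until it returns True or timeout_s elapses.

    early_abort_fn, if given, runs before every poll and may raise to stop
    the wait immediately (for example when a daemon has already exited).
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if early_abort_fn is not None:
            early_abort_fn()
        if condition_fn():
            return True
        time.sleep(interval_s)
    return False


def make_process_check(count_impalads, expected_num_impalads):
    """Build an early-abort callback that fails once a process is missing.

    count_impalads is a hypothetical callable returning the number of
    impalad processes currently running.
    """
    def check_processes_still_running():
        found = count_impalads()
        assert found >= expected_num_impalads, (
            "Found %d impalad process(es), expected at least %d"
            % (found, expected_num_impalads))
    return check_processes_still_running
```

With a callback like this, the wait fails as soon as the process count drops from 3 to 2, instead of polling until the timeout, which matches the "Found 2 impalad" line followed by the AssertionError in the output above.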

The log of the impalad that exited (pid 15245, attached) shows:
{code}
I0528 15:14:54.587469 15245 tmp-file-mgr.cc:229] Using scratch directory /tmp/tmpR006lp/impala-scratch on disk 0 limit: 8589934592.00 GB
I0528 15:14:54.648952 15245 status.cc:129] Failed to get post-punch file size: Not found: /tmp/tmpR006lp/impala-scratch/88432f73256ff458:c620697eade771bb: No such file or directory (error 2)
    @          0x1d5b072  impala::Status::Status()
    @          0x264fa09  impala::FileSystemUtil::CheckHolePunch()
    @          0x22e6947  impala::TmpFileMgr::InitCustom()
    @          0x22e59a4  impala::TmpFileMgr::InitCustom()
    @          0x22e58f0  impala::TmpFileMgr::Init()
    @          0x248835c  impala::ImpalaServer::ImpalaServer()
    @          0x2483a34  ImpaladMain()
    @          0x1d048af  main
    @     0x7f23f3a64c04  __libc_start_main
    @          0x1d04726  (unknown)
E0528 15:14:54.649108 15245 impala-server.cc:394] Failed to get post-punch file size: Not found: /tmp/tmpR006lp/impala-scratch/88432f73256ff458:c620697eade771bb: No such file or directory (error 2)
E0528 15:14:54.649127 15245 impala-server.cc:397] Aborting Impala Server startup due to improperly configured scratch directories.. Impalad exiting.
{code}
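The scratch directory /tmp/tmpR006lp/impala-scratch, or the temporary file created in it for the hole-punch check, was missing by the time FileSystemUtil::CheckHolePunch() read the post-punch file size, so it looks like the scratch dir was not set up successfully. Note that all three impalads were started with the same -scratch_dirs list.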
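
One way to check from the test side whether a generated scratch directory is usable is to create a file under it, write to it, and read back its size. The sketch below does only that. It assumes Python 3, the helper name probe_scratch_dir is hypothetical, and it does not punch holes, so it is a rough stand-in for the check that failed and not Impala's implementation.

```python
import os
import tempfile


def probe_scratch_dir(root, subdir="impala-scratch", size=4096):
    """Return None if a file can be created, written and stat'ed under
    root/subdir, otherwise a short error string.

    The subdir default mirrors the path seen in the impalad log above.
    """
    scratch = os.path.join(root, subdir)
    try:
        os.makedirs(scratch, exist_ok=True)
        fd, path = tempfile.mkstemp(dir=scratch)
        try:
            os.write(fd, b"\0" * size)
            os.fsync(fd)
            # Raises FileNotFoundError if the file or directory was removed
            # between creation and this call.
            actual = os.stat(path).st_size
        finally:
            os.close(fd)
            os.unlink(path)
    except OSError as e:
        return "%s: %s" % (scratch, e)
    if actual != size:
        return "%s: expected %d bytes, found %d" % (scratch, size, actual)
    return None
```

Running this against each entry of -scratch_dirs before starting the cluster would show whether a directory is missing or unwritable up front. It would not detect a file that is removed by another process while an impalad is starting.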



--
This message was sent by Atlassian Jira
(v8.3.4#803005)