Posted to issues-all@impala.apache.org by "Michael Smith (Jira)" <ji...@apache.org> on 2023/04/21 21:18:00 UTC

[jira] [Work started] (IMPALA-12080) Test test_recover_many_partitions is very slow on S3, Ozone

     [ https://issues.apache.org/jira/browse/IMPALA-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-12080 started by Michael Smith.
----------------------------------------------
> Test test_recover_many_partitions is very slow on S3, Ozone
> -----------------------------------------------------------
>
>                 Key: IMPALA-12080
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12080
>             Project: IMPALA
>          Issue Type: Task
>          Components: Infrastructure
>            Reporter: Michael Smith
>            Assignee: Michael Smith
>            Priority: Major
>
> The test metadata/test_recover_partitions.py::TestRecoverPartitions::test_recover_many_partitions takes less than 2 minutes on HDFS runs but about 1 hour on Ozone and S3 runs. This appears to be because test setup invokes a filesystem client 1400 times to create the directories and files for 700 partitions. On HDFS that is fast because the test uses [pywebhdfs|https://pypi.org/project/pywebhdfs/] to do the work in-process, but on all other filesystems it defaults to the hdfs Java CLI, which runs as a separate process for every call.
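
The figures in the description imply a rough per-call cost, which the following sketch works out. It assumes two client calls per partition (1400 calls across 700 partitions) and that the whole run time is spent in those setup calls, so the results are upper bounds. The function name is hypothetical and the durations are the approximate ones quoted above.

```python
# Figures taken from the issue description; both durations are approximate.
PARTITIONS = 700
CALLS_PER_PARTITION = 2  # assumed: one call for the directory, one for the file


def implied_seconds_per_call(total_seconds,
                             partitions=PARTITIONS,
                             calls_per_partition=CALLS_PER_PARTITION):
    """Upper bound on per-call cost, assuming all time goes to setup calls."""
    return total_seconds / (partitions * calls_per_partition)


total_calls = PARTITIONS * CALLS_PER_PARTITION
cli_cost = implied_seconds_per_call(60 * 60)  # ~1 hour on Ozone and S3
hdfs_cost = implied_seconds_per_call(2 * 60)  # <2 minutes on HDFS

print(f"{total_calls} filesystem client calls")
print(f"CLI-backed filesystems: about {cli_cost:.2f} s per call")
print(f"HDFS via pywebhdfs: under {hdfs_cost:.3f} s per call")
```

Under these assumptions each CLI-backed call costs a few seconds, against well under a tenth of a second for the in-process client, which is consistent with the per-call overhead being the dominant cost.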



