Posted to issues-all@impala.apache.org by "Sahil Takiar (Jira)" <ji...@apache.org> on 2020/03/02 16:20:00 UTC
[jira] [Commented] (IMPALA-9439) Add kudu support to single_node_perf_run.py
[ https://issues.apache.org/jira/browse/IMPALA-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17049373#comment-17049373 ]
Sahil Takiar commented on IMPALA-9439:
--------------------------------------
Turns out the correct way to run this against Kudu is:
{code}
./bin/single_node_perf_run.py 984f675e05d83f048afbee5b8a78413b2a624d13 --table_formats=kudu/none/none --load --workloads tpch --scale 5
{code}
AFAICT the {{--scale}} option is always required. Without it, the database name resolves to something like {{tpchNone_kudu}}, which makes it look like there is a bug in the code. I think we can improve the docs of {{single_node_perf_run.py}} and just make the scale option mandatory.
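To illustrate how a name like {{tpchNone_kudu}} can come about, here is a minimal sketch. It is not the actual Impala code: {{database_name}} and {{parse_args}} are hypothetical helpers, the naming scheme (workload + scale + "_" + file format) is an assumption inferred from the error message, and the sketch uses {{argparse}} regardless of what the real script uses. It only shows that formatting an unset scale ({{None}}) into a string produces "None" in the name, and that marking the option as required would catch the omission up front.

```python
import argparse


def database_name(workload, scale, table_format):
    """Hypothetical naming helper: workload + scale + '_' + file format.

    The real naming logic in Impala may differ; this only mirrors the
    shape of the name seen in the error message.
    """
    file_format = table_format.split("/")[0]
    return "{0}{1}_{2}".format(workload, scale, file_format)


def parse_args(argv):
    """Hypothetical option parser in which --scale is mandatory."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--workloads", default="tpch")
    parser.add_argument("--table_formats", default="kudu/none/none")
    # required=True turns a missing --scale into an immediate usage error
    # instead of a confusing database name later on.
    parser.add_argument("--scale", type=int, required=True)
    return parser.parse_args(argv)


if __name__ == "__main__":
    # An unset scale (None) is formatted as the literal text "None".
    print(database_name("tpch", None, "kudu/none/none"))
    # With a scale supplied, the name no longer contains "None".
    opts = parse_args(["--scale", "5"])
    print(database_name(opts.workloads, opts.scale, opts.table_formats))
```

With a required option, running without {{--scale}} fails with a usage message rather than with "Database does not exist".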
> Add kudu support to single_node_perf_run.py
> -------------------------------------------
>
> Key: IMPALA-9439
> URL: https://issues.apache.org/jira/browse/IMPALA-9439
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
>
> Running {{bin/single_node_perf_run.py}} with {{--table_formats kudu}} currently fails with the following error:
> {code:java}
> Traceback (most recent call last):
> File "./bin/run-workload.py", line 257, in <module>
> workload_runners.append(WorkloadRunner(workload, scale_factor, config))
> File "./tests/performance/workload_runner.py", line 71, in __init__
> self._generate_test_vectors()
> File "./tests/performance/workload_runner.py", line 88, in _generate_test_vectors
> self._test_vectors.append(TableFormatInfo.create_from_string(dataset, tf))
> File "./tests/common/test_dimensions.py", line 70, in create_from_string
> raise ValueError, 'Invalid table format %s' % table_format_string
> ValueError: Invalid table format kudu
> Traceback (most recent call last):
> File "./bin/single_node_perf_run.py", line 348, in <module>
> main()
> File "./bin/single_node_perf_run.py", line 338, in main
> perf_ab_test(options, args)
> File "./bin/single_node_perf_run.py", line 230, in perf_ab_test
> build(hash_a, options)
> File "./bin/single_node_perf_run.py", line 111, in build
> configured_call(buildall)
> File "./bin/single_node_perf_run.py", line 87, in configured_call
> return subprocess.check_call(["bash", "-c", cmd])
> File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
> raise CalledProcessError(retcode, cmd)
> subprocess.CalledProcessError: Command '['bash', '-c', 'source ./bin/impala-config.sh && ./buildall.sh -notests -release -noclean']' returned non-zero exit status 2{code}
> {{kudu/none}} and {{kudu/none/none}} fail as well with this error:
> {code:java}
> AnalysisException: Database does not exist: tpchNone_kudu {code}
> I think there is some handling of compression types / codecs that isn't applicable to Kudu, and that is preventing this from working properly.
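The "Invalid table format kudu" error above is consistent with a parser that expects a slash-separated string such as {{kudu/none/none}}. The following is a minimal sketch of that kind of validation, not the actual {{TableFormatInfo.create_from_string}} implementation: {{TableFormat}} and {{parse_table_format}} are hypothetical names, and the field names and the strict three-part rule are assumptions. In particular, this sketch also rejects two-part strings like {{kudu/none}}, which may be stricter than the real parser.

```python
from collections import namedtuple

# Hypothetical container; the field names are assumptions.
TableFormat = namedtuple(
    "TableFormat", ["file_format", "codec", "compression_type"])


def parse_table_format(table_format_string):
    """Split 'file_format/codec/compression_type' into its three parts."""
    parts = table_format_string.split("/")
    if len(parts) != 3:
        # Same message shape as the ValueError in the traceback above.
        raise ValueError("Invalid table format %s" % table_format_string)
    return TableFormat(*parts)


if __name__ == "__main__":
    print(parse_table_format("kudu/none/none"))
    try:
        parse_table_format("kudu")
    except ValueError as err:
        print(err)
```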