Posted to issues@impala.apache.org by "Quanlong Huang (JIRA)" <ji...@apache.org> on 2019/07/16 21:54:00 UTC
[jira] [Resolved] (IMPALA-8486) test_udf_update_via_drop and test_udf_update_via_create fail on local catalog
[ https://issues.apache.org/jira/browse/IMPALA-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Quanlong Huang resolved IMPALA-8486.
------------------------------------
Resolution: Fixed
Fix Version/s: Impala 3.3.0
> test_udf_update_via_drop and test_udf_update_via_create fail on local catalog
> -----------------------------------------------------------------------------
>
> Key: IMPALA-8486
> URL: https://issues.apache.org/jira/browse/IMPALA-8486
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Affects Versions: Impala 3.3.0
> Reporter: Tim Armstrong
> Assignee: Quanlong Huang
> Priority: Critical
> Labels: catalog-v2
> Fix For: Impala 3.3.0
>
>
> {noformat}
> TestUdfTargeted.test_udf_update_via_drop[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none]
> tests/query_test/test_udfs.py:541: in test_udf_update_via_drop
> self._run_query_all_impalads(exec_options, query_stmt, ["New UDF"])
> tests/query_test/test_udfs.py:52: in _run_query_all_impalads
> assert result.data == expected
> E assert ['Old UDF'] == ['New UDF']
> E At index 0 diff: 'Old UDF' != 'New UDF'
> E Full diff:
> E - ['Old UDF']
> E + ['New UDF']
> ----------------------------
> {noformat}
> The tests are checking that the local UDF caches on each impalad get invalidated by a drop/create of a function referencing the HDFS file containing the UDF. The test fails because the local catalog, unlike the regular catalog, doesn't invalidate LibCache entries upon receiving a catalog update.
> I looked at this long enough to realise that the invalidation mechanism is fundamentally broken: it doesn't work with dedicated executors. It also creates a race between the statestore updates and queries referencing the UDFs. If a query wins the race, it can incorrectly use the old version that should have been invalidated.
> I think this is potentially a serious problem, because old JAR/SO versions could persist in the cache indefinitely if old versions are overwritten in place.
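
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Impala's LibCache: the class `PathKeyedCache`, the dict standing in for HDFS, and the path `/udfs/lib.so` are all hypothetical. It shows why a cache keyed only by file path keeps serving the old library after an in-place overwrite until an explicit invalidation arrives, which is the window in which a query can "win the race".

```python
class PathKeyedCache:
    """Toy stand-in for a per-node library cache keyed only by file path.

    Hypothetical illustration; it does not mirror Impala's LibCache API.
    """

    def __init__(self, storage):
        # storage is a dict (path -> contents) standing in for HDFS.
        self._storage = storage
        self._entries = {}

    def load(self, path):
        # Only the first load reads from storage; later loads hit the cache.
        if path not in self._entries:
            self._entries[path] = self._storage[path]
        return self._entries[path]

    def invalidate(self, path):
        # Drop the entry so the next load re-reads from storage.
        self._entries.pop(path, None)


# Hypothetical path and contents, chosen to echo the test output above.
storage = {"/udfs/lib.so": "Old UDF"}
cache = PathKeyedCache(storage)

first = cache.load("/udfs/lib.so")    # populates the cache
storage["/udfs/lib.so"] = "New UDF"   # library overwritten in place
stale = cache.load("/udfs/lib.so")    # query runs before invalidation
cache.invalidate("/udfs/lib.so")      # invalidation finally arrives
fresh = cache.load("/udfs/lib.so")    # re-read picks up the new file
```

In this model `stale` still holds the old contents, matching the `['Old UDF'] == ['New UDF']` assertion failure in the trace. If the invalidation never reaches a node, the stale entry persists indefinitely.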