You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "Csaba Ringhofer (Jira)" <ji...@apache.org> on 2023/05/05 07:00:00 UTC

[jira] [Comment Edited] (IMPALA-10585) retry_failed_queries=true should not apply to DMLs

    [ https://issues.apache.org/jira/browse/IMPALA-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17719692#comment-17719692 ] 

Csaba Ringhofer edited comment on IMPALA-10585 at 5/5/23 6:59 AM:
------------------------------------------------------------------

Small brain dump about the possible support for retrying DML queries:
(only considering filesystem based tables, not HBase / Kudu)
 
The question is whether there is some kind of commit phase on the coordinator - if yes, and without commit there are no side effects on the table, then retry seems straightforward:
- if an executor fails before the commit phase then the coordinator will not commit the write and can retry it without any side-effect
- if all executors succeeded then the coordinator can proceed with the commit as it cannot receive an "executor failed" event anymore

The case where this doesn't seem possible are external tables on object stores where the files are written to their final destination (S3_SKIP_INSERT_STAGING). The files written can be immediately visible to other readers so this case is dirty by design, the best is to return an error when an executor.

For other table types it seems possible:
- external tables: the commit phase is moving the files from staging dir to their final phase - if an executor fails, the files already written to staging dir can be ignored / deleted and write can be retried
- Hive ACID tables: the query can be aborted in case of failure and the new trial will have to get a new transaction and write id 
- Iceberg tables: this seems the easiest as Iceberg has atomic commit and doesn't have to deal with transaction and write id like Hive ACID 

Based on manual tests Iceberg seemed to already work correctly (before disabling retry for DML statements). So for Iceberg there may be no code changes needed but adding test coverage is still needed.


was (Author: csringhofer):
Small brain dump about the possible support for retrying DML queries:
(only considering filesystem based tables, not HBase / Kudu)
 
The question is whether there is some kind of commit phase on the coordinator - if yes, and without commit there are no side effects on the table, then retry seems straightforward:
- if an executor fails before the commit phase then the coordinator will not commit the write and can retry it without any side-effect
- if all executors succeeded then the coordinator can proceed with the commit as it cannot receive an "executor failed" event anymore

The case where this doesn't seem possible are external tables on object stores where the files are written to their final destination (S3_SKIP_INSERT_STAGING). The files written can be immediately visible to other readers so this case is dirty by design, the best is to return an error when an executor.

For other table types it seems possible:
- external tables: the commit phase is moving the files from staging dir to their final phase - if an executor fails, the files already written to staging dir can be ignored / deleted and write can be retried
- Hive ACID tables: the query can be aborted in case of failure and the new trial will have to get a new transaction and write id 
- Iceberg tables: this seems the easiest as Iceberg has atomic commit and doesn't have to deal with transaction and write id like Hive ACID 


> retry_failed_queries=true should not apply to DMLs
> --------------------------------------------------
>
>                 Key: IMPALA-10585
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10585
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.0.0
>            Reporter: Joe McDonnell
>            Priority: Major
>
> I noticed that retry_failed_queries=true will retry insert statements:
> {noformat}
> [localhost:21050] joetest> insert into retrytest select count(*) from functional.alltypes where bool_col = sleep(50);
> Query: insert into retrytest select count(*) from functional.alltypes where bool_col = sleep(50)
> Query submitted at: 2021-03-15 10:23:32 (Coordinator: http://joemcdonnell:25000)
> Query progress can be monitored at: http://joemcdonnell:25000/query_plan?query_id=5f4b8c0224faa31a:4a585cf700000000
> ...
> Failed due to unreachable impalad(s): joemcdonnell:27002                                                        00000000
> Retried query link: http://joemcdonnell:25000/query_plan?query_id=824b6b103ea68ea3:bc804b4Failed due to unreachable impalad(s): joemcdonnell:27002
> ...
> Query has been retried using query id: 824b6b103ea68ea3:bc804b4500000000                  Retried query link: http://joemcdonnell:25000/query_plan?query_id=824b6b103ea68ea3:bc804b4500000000
> Modified 1 row(s) in 47.71s{noformat}
> I don't think this was intended to work, because https://issues.apache.org/jira/browse/IMPALA-9734 was closed saying that we don't do retries for write statements. There also aren't any tests for these cases.
> I think we intended to exempt DML statements from retry_failed_queries=true. We should implement that and add tests to make sure DMLs don't get retried.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org