You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/04/13 07:24:00 UTC

[jira] [Commented] (IMPALA-12008) End Time in profile is set before DML finishes

    [ https://issues.apache.org/jira/browse/IMPALA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711718#comment-17711718 ] 

ASF subversion and git services commented on IMPALA-12008:
----------------------------------------------------------

Commit 32aeeecc078015263a1282f631acde4df6bd789c in impala's branch refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=32aeeecc0 ]

IMPALA-12008: Fix incorrect end time in DML profiles

The end time in DML profiles is incorrect that it's actually the time
when admission control resources are released. This is correct for
normal queries. But for DMLs, coordinator still needs to invoke the
updateCatalog RPC of catalogd to finalize the HMS update. The end time
should be set after the request finished.

This patch fixes the DML end time by not setting it after the admission
control resources are released. Instead, it's set after
ClientRequestState::WaitInternal() finishes, which makes sure the
updateCatalog RPC has finished.

Also adds a duration field in profile by the way.

For testing, this patch also adds a new debug action in catalogd
(catalogd_insert_finish_delay) to inject delays in updateCatalog.

Tests
 - Added e2e test to verify the end time of a DML profile

Change-Id: I9c5dc92c2f8576ceed374d447c0ac05022a2dee6
Reviewed-on: http://gerrit.cloudera.org:8080/19644
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> End Time in profile is set before DML finishes
> ----------------------------------------------
>
>                 Key: IMPALA-12008
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12008
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 3.2.0, Impala 4.0.0, Impala 3.3.0, Impala 3.4.0, Impala 3.4.1, Impala 4.1.0, Impala 4.2.0, Impala 4.1.1
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>              Labels: observability
>         Attachments: insert_profile.txt
>
>
> The runtime profile for DML (e.g. INSERT) has incorrect "End Time" that it's actually the time of DML data written. The DML might still running in updating hive metastore.
> Attached a profile as an example:
> {code:java}
> Query (id=b64d22b7100e40ea:5345525100000000):
>     Start Time: 2023-03-17 13:17:05.158778000
>     End Time: 2023-03-17 13:17:05.679796000
>     Query Type: DML
>     Sql Statement: insert into my_alltypes partition(year, month) select * from functional.alltypestiny{code}
> The "End Time" is 13:17:05. However, it's still running after that time. Showing in the coordinator logs:
> {noformat}
> I0317 13:17:05.158836 32226 impala-server.cc:1363] b64d22b7100e40ea:5345525100000000] Registered query query_id=b64d22b7100e40ea:5345525100000000 session_id=4f4b9d15a11d89b1:483361be5dda7d80
> I0317 13:17:05.159116 32226 Frontend.java:1928] b64d22b7100e40ea:5345525100000000] Analyzing query: insert into my_alltypes partition(year, month) select * from functional.alltypestiny db: default
> ...
> I0317 13:17:05.658111  1848 coordinator.cc:855] Finalizing query: b64d22b7100e40ea:5345525100000000
> I0317 13:17:05.677682  1848 coordinator.cc:899] Removing staging directory: hdfs://localhost:20500/test-warehouse/my_alltypes/_impala_insert_staging/b64d22b7100e40ea_5345525100000000/
> I0317 13:17:05.679735  1848 coordinator.cc:735] ExecState: query id=b64d22b7100e40ea:5345525100000000 execution completed
> I0317 13:17:05.679783  1848 coordinator.cc:1399] Release admission control resources for query_id=b64d22b7100e40ea:5345525100000000
> I0317 13:17:05.680047  1848 client-request-state.cc:1523] Updating metastore with 4 altered partitions (year=2009/month=1, year=2009/month=2, year=2009/month=3, year=2009/month=4)
> I0317 13:17:05.680058  1848 client-request-state.cc:1553] Executing FinalizeDml() using CatalogService
> I0317 13:17:31.990342  1848 ImpaladCatalog.java:223] Adding: TABLE:default.my_alltypes version: 1838 size: 7747
> I0317 13:17:32.309155 32226 impala-hs2-server.cc:845] CloseOperation(): query_id=b64d22b7100e40ea:5345525100000000
> I0317 13:17:32.309190 32226 impala-server.cc:1447] UnregisterQuery(): query_id=b64d22b7100e40ea:5345525100000000
> I0317 13:17:32.309198 32226 coordinator-backend-state.cc:998] query_id=b64d22b7100e40ea:5345525100000000 target backend=127.0.0.1:27002: Not cancelling because the backend is already done:
> I0317 13:17:32.309206 32226 coordinator-backend-state.cc:998] query_id=b64d22b7100e40ea:5345525100000000 target backend=127.0.0.1:27001: Not cancelling because the backend is already done:
> I0317 13:17:32.309214 32226 coordinator-backend-state.cc:998] query_id=b64d22b7100e40ea:5345525100000000 target backend=127.0.0.1:27000: Not cancelling because the backend is already done:
> I0317 13:17:32.309221 32226 coordinator.cc:1023] CancelBackends() query_id=b64d22b7100e40ea:5345525100000000, tried to cancel 0 backends
> I0317 13:17:32.312103 32105 impala-server.cc:1479] Query successfully unregistered: query_id=b64d22b7100e40ea:5345525100000000
> I0317 13:17:32.315893 32226 query-exec-mgr.cc:213] ReleaseQueryState(): deleted query_id=b64d22b7100e40ea:5345525100000000
> I0317 13:17:34.208513 32140 ImpaladCatalog.java:223] Adding: TABLE:default.my_alltypes version: 1838 size: 3860
> I0317 13:17:34.208752 32140 ImpaladCatalog.java:223] Adding: CATALOG_SERVICE_ID version: 1838 size: 60
> I0317 13:17:34.209161 32140 ImpaladCatalog.java:248] Adding 4 partition(s): HDFS_PARTITION:default.my_alltypes:(year=2009/month=1,year=2009/month=2,...,year=2009/month=4), version=1838, size=(avg=1006, min=1006, max=1006, sum=4024)
> I0317 13:17:34.210037 32140 impala-server.cc:2061] Catalog topic update applied with version: 1838 new min catalog object version: 3{noformat}
> Note that to reproduce the issue, I add a breakpoint here to manually pause updateCatalog() in catalogd: 
> [https://github.com/apache/impala/blob/8b375a66a29cfea33a4e418b1fa3b9b5139cf907/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L6791]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org