Posted to issues@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2018/11/03 00:16:00 UTC

[jira] [Resolved] (IMPALA-4092) Log the remote fragment error message on coordinator too

     [ https://issues.apache.org/jira/browse/IMPALA-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-4092.
-----------------------------------
    Resolution: Fixed

This should work now with the various query lifecycle improvements:
{noformat}
I1102 17:03:38.872902 14702 coordinator.cc:498] ExecState: query id=624fcdf334c50adb:4364c01a00000000 finstance=624fcdf334c50adb:4364c01a00000005 on host=tarmstrong-box:22002 (EXECUTING -> ERROR) status=Could not create files in any configured scratch directories (--scratch_dirs=/tmp/impala-scratch) on backend 'tarmstrong-box:22002'. See logs for previous errors that may have prevented creating or writing scratch files.
{noformat}
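For anyone who wants to pull these transitions out of a coordinator log, here is a minimal sketch. The regular expression is modelled only on the single ExecState line quoted above, so the exact layout in other Impala versions is an assumption, and the names EXEC_STATE_RE and find_fragment_errors are hypothetical helpers, not part of Impala.

```python
import re

# Pattern modelled on the one "ExecState" line quoted in this message.
# Assumption: other Impala versions may format this line differently.
EXEC_STATE_RE = re.compile(
    r"ExecState: query id=(?P<query_id>\S+) "
    r"finstance=(?P<finstance>\S+) "
    r"on host=(?P<host>\S+) "
    r"\((?P<old_state>\w+) -> (?P<new_state>\w+)\) "
    r"status=(?P<status>.*)"
)


def find_fragment_errors(lines):
    """Yield a dict for each ExecState line that transitions to ERROR."""
    for line in lines:
        match = EXEC_STATE_RE.search(line)
        if match and match.group("new_state") == "ERROR":
            yield match.groupdict()


# The log line from this message, used as sample input.
sample = (
    "I1102 17:03:38.872902 14702 coordinator.cc:498] ExecState: "
    "query id=624fcdf334c50adb:4364c01a00000000 "
    "finstance=624fcdf334c50adb:4364c01a00000005 "
    "on host=tarmstrong-box:22002 (EXECUTING -> ERROR) "
    "status=Could not create files in any configured scratch directories "
    "(--scratch_dirs=/tmp/impala-scratch) on backend 'tarmstrong-box:22002'. "
    "See logs for previous errors that may have prevented creating or "
    "writing scratch files."
)

errors = list(find_fragment_errors([sample]))
for err in errors:
    print(err["query_id"], err["host"], err["status"])
```

To scan a real log, pass an open file object instead of the sample list. The log file location depends on the deployment and is not given here.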

> Log the remote fragment error message on coordinator too
> --------------------------------------------------------
>
>                 Key: IMPALA-4092
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4092
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Distributed Exec
>    Affects Versions: Impala 2.2
>            Reporter: Miklos Szurap
>            Priority: Major
>              Labels: supportability
>
> When a remote fragment fails, the coordinator logs show only the Cancel() message, not the actual error message:
> {noformat}
> I0906 07:36:36.415954 13332 coordinator.cc:386] starting 2 backends for query 634caf4174771cbc:fe4a29d380c2ddbe
> I0906 07:36:36.775049 23369 plan-fragment-executor.cc:300] Open(): instance_id=634caf4174771cbc:fe4a29d380c2ddbf
> I0906 07:36:37.723321 20510 coordinator.cc:1316] Backend 1 completed, 1 remaining: query_id=634caf4174771cbc:fe4a29d380c2ddbe
> I0906 07:36:37.723358 20510 coordinator.cc:1325] query_id=634caf4174771cbc:fe4a29d380c2ddbe: first in-progress backend: remote-impalad-host:22000
> I0906 07:36:37.734156 22711 coordinator.cc:1131] Cancel() query_id=634caf4174771cbc:fe4a29d380c2ddbe
> {noformat}
> Of course the real error message is logged on the remote (backend) host, for example:
> {noformat}
> I0906 07:36:37.732405 15306 runtime-state.cc:230] Error from query 634caf4174771cbc:fe4a29d380c2ddbe: Create file /impala/impalad/impala-scratch/634caf4174771cbc:fe4a29d380c2ddbe_663a9199-63b3-4d5e-be84-297e05a99970 failed with errno=2 description=Error(2): No such file or directory
> {noformat}
> The error message is recorded in the query profile, but it is hard to track, trace, or monitor why the query actually failed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)