Posted to user@flink.apache.org by Gen Luo <lu...@gmail.com> on 2022/07/21 03:12:04 UTC

Re: [DISCUSS] Replace Attempt column with Attempt Number on the subtask list page of the Web UI

Hi user mailing list,

I'm also forwarding this thread to you. Please let me know if you have any
comments or feedback!

Best,
Gen

On Wed, Jul 20, 2022 at 4:25 PM Zhu Zhu <re...@gmail.com> wrote:

> Thanks for starting this discussion, Gen!
> I agree it is confusing, even troublesome, to show an attempt id that is
> different from the corresponding attempt number in REST, metrics and logs.
> It burdens users with doing the mapping during troubleshooting. Mis-mapping
> can easily happen and result in wasted effort and wrong conclusions.
>
> Therefore, +1 for this proposal.
>
> Thanks,
> Zhu
>
> On Wed, Jul 20, 2022 at 3:24 PM Gen Luo <lu...@gmail.com> wrote:
> >
> > Hi everyone,
> >
> > I'd like to propose a change on the Web UI to replace the Attempt column
> > with an Attempt Number column on the subtask list page.
> >
> > From the very beginning, the attempt number shown has been calculated at
> > the frontend as subtask.attempt + 1, which means the attempt number shown
> > on the web UI is not the same as the one in the runtime, the logs, and
> > the metrics. Users may get confused since they can't find logs or metrics
> > of the subtask with the same attempt number.
> >
> > Fortunately, for now users don't need to care about the attempt number,
> > since there can be only one attempt of each subtask. However, the
> > confusion seems inevitable once speculative execution[1] or the attempt
> > history is introduced, since multiple attempts of the same subtask can be
> > executed or presented at the same time.
> >
> > I suggest that the attempt number shown on the web UI be changed to
> > align with that on the runtime side, which is used in logging and metrics
> > reporting. To avoid confusion, the column should also be renamed to
> > "Attempt Number". The changes should only affect the Web UI. No REST API
> > needs to change. What do you think?
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job
> >
> > Best,
> > Gen
>
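
To make the off-by-one in the quoted proposal concrete, here is a minimal sketch. It is not taken from the Flink code base: the SubtaskRow interface and both function names are hypothetical, and only the expression subtask.attempt + 1 comes from the thread. It contrasts the value the column shows today with the value the proposal would show.

```typescript
// Hypothetical shape of one row on the subtask list page. Only the
// `attempt` field is mentioned in the thread; the rest is assumed.
interface SubtaskRow {
  subtask: number; // subtask index (assumed field name)
  attempt: number; // attempt number as reported by the runtime
}

// Current behaviour described in the thread: the frontend adds 1,
// so the displayed value differs from logs and metrics.
function legacyAttemptLabel(row: SubtaskRow): number {
  return row.attempt + 1;
}

// Proposed behaviour: show the runtime value unchanged under an
// "Attempt Number" column, so it matches logs and metrics.
function proposedAttemptNumber(row: SubtaskRow): number {
  return row.attempt;
}

const row: SubtaskRow = { subtask: 3, attempt: 0 };
console.log(legacyAttemptLabel(row));    // 1
console.log(proposedAttemptNumber(row)); // 0
```

With the second function, a user who sees attempt number 0 in the UI would look for attempt 0 in the logs and metrics, with no mental mapping needed.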