Posted to user@flink.apache.org by Darshan Singh <da...@gmail.com> on 2018/04/13 12:37:03 UTC
Any metrics to get the shuffled and intermediate data in flink
Hi
Are there any metrics in Flink that tell me how much data a given operator
read (say 1 GB), how much it shuffled (or otherwise processed), and how much
it wrote to temp space or anywhere else (say 1 or 2 GB)?
One of my jobs is failing because it runs out of disk space. It performs many
sorts, groups and joins, and I would like to know which of them is generating
most of the temp data.
Thanks
Re: Any metrics to get the shuffled and intermediate data in flink
Posted by Darshan Singh <da...@gmail.com>.
Thanks, I can see those in the UI.
Thanks
On Fri, Apr 13, 2018 at 3:12 PM, TechnoMage <ml...@technomage.com> wrote:
> If you look at the web UI for flink it will tell you the bytes received
> and sent for each stage of a job. I have not seen any similar metric for
> persisted state per stage, which would be nice to have as well.
>
> Michael
Re: Any metrics to get the shuffled and intermediate data in flink
Posted by TechnoMage <ml...@technomage.com>.
If you look at the Flink web UI, it shows the bytes received and sent for each stage of a job. I have not seen a similar metric for persisted state per stage, which would be nice to have as well.
Michael
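The per-stage byte counts shown in the web UI can probably also be read programmatically. The following is a minimal sketch, not a tested recipe: it assumes the JobManager's REST endpoint serves job details at /jobs/<job-id>, and that each entry under "vertices" carries a "metrics" object with "read-bytes" and "write-bytes" fields. Those field names and the helper names below are assumptions to check against your Flink version. Note that these counters describe data exchanged between tasks, so they are only a rough proxy for temp space used by sorts and joins.

```python
import json
from urllib.request import urlopen


def fetch_job_details(base_url, job_id):
    """Fetch job details as a dict (assumed endpoint: GET <base_url>/jobs/<job_id>)."""
    with urlopen(f"{base_url}/jobs/{job_id}") as resp:
        return json.load(resp)


def rank_vertices_by_bytes(job_details, key="write-bytes"):
    """Return (vertex name, byte count) pairs sorted by byte count, largest first.

    `key` is the assumed metric field name, e.g. "write-bytes" or "read-bytes".
    Vertices without that metric are counted as 0.
    """
    rows = []
    for vertex in job_details.get("vertices", []):
        metrics = vertex.get("metrics", {})
        rows.append((vertex.get("name", "?"), int(metrics.get(key, 0))))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

To use it, call fetch_job_details with the JobManager address (for example "http://localhost:8081", a hypothetical value) and a job id, then pass the result to rank_vertices_by_bytes. The stages at the top of the list are the ones sending the most data downstream.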