Posted to dev@griffin.apache.org by jo...@boehringer-ingelheim.com on 2019/09/19 10:50:43 UTC

Average of the measures

Hello,

We need to compute the average of the measures for a certain data set. Has anybody done this with Apache Griffin?

Regards

RE: Average of the measures

Posted by jo...@boehringer-ingelheim.com.
Hi William, 

Yes, we need a set of individual measures for a dataset and then the average of that set, without having to recalculate each individual measure: we already have the value of each measure and only need to calculate the average.

Regards

-----Original Message-----
From: William Guo <gu...@apache.org> 
Sent: Thursday, September 26, 2019 4:29 PM
To: dev@griffin.apache.org
Subject: Re: Average of the measures

hi,

Based on your message, I understand that, to calculate the average, you want to find a way to avoid recalculating the average measure when datasets are updated.
Is my understanding right?


Thanks,
William


Re: Average of the measures

Posted by William Guo <gu...@apache.org>.
hi,

Based on your message, I understand that, to calculate the average, you want
to find a way to avoid recalculating the average measure when datasets are
updated.
Is my understanding right?


Thanks,
William


On Wed, Sep 25, 2019 at 11:17 PM <
jose.martin_santacruz.ext@boehringer-ingelheim.com> wrote:

> Hi William,
>
> The use case is the following: we have a data lake that is structured in
> datasets. Each of these datasets can have a set of quality measures, and
> the user wants a global measure of dataset quality that is the average of
> all the dataset's quality measures. To do this, we defined a custom
> measure that calculates this average, but it implies calculating all the
> quality measures again. We are trying to find a way of calculating the
> average without recalculating the quality measures.
>
> Regards
>
> -----Original Message-----
> From: William Guo <gu...@apache.org>
> Sent: Tuesday, September 24, 2019 3:14 AM
> To: dev@griffin.apache.org
> Subject: Re: Average of the measures
>
> hi,
>
> Could you tell us your use case?
> Normally, you can use the avg function from Spark SQL.
> Griffin supports Spark SQL directly.
>
> Thanks,
> William
>
> On Thu, Sep 19, 2019 at 6:50 PM <
> jose.martin_santacruz.ext@boehringer-ingelheim.com> wrote:
>
> > Hello,
> >
> > We need to compute the average of the measures for a certain data set.
> > Has anybody done this with Apache Griffin?
> >
> > Regards
> >
>

RE: Average of the measures

Posted by jo...@boehringer-ingelheim.com.
Hi William,

The use case is the following: we have a data lake that is structured in datasets. Each of these datasets can have a set of quality measures, and the user wants a global measure of dataset quality that is the average of all the dataset's quality measures. To do this, we defined a custom measure that calculates this average, but it implies calculating all the quality measures again. We are trying to find a way of calculating the average without recalculating the quality measures.

Regards

-----Original Message-----
From: William Guo <gu...@apache.org> 
Sent: Tuesday, September 24, 2019 3:14 AM
To: dev@griffin.apache.org
Subject: Re: Average of the measures

hi,

Could you tell us your use case?
Normally, you can use the avg function from Spark SQL.
Griffin supports Spark SQL directly.

Thanks,
William

On Thu, Sep 19, 2019 at 6:50 PM <
jose.martin_santacruz.ext@boehringer-ingelheim.com> wrote:

> Hello,
>
> We need to compute the average of the measures for a certain data set.
> Has anybody done this with Apache Griffin?
>
> Regards
>

Re: Average of the measures

Posted by William Guo <gu...@apache.org>.
hi,

Could you tell us your use case?
Normally, you can use the avg function from Spark SQL.
Griffin supports Spark SQL directly.

Thanks,
William

On Thu, Sep 19, 2019 at 6:50 PM <
jose.martin_santacruz.ext@boehringer-ingelheim.com> wrote:

> Hello,
>
> We need to compute the average of the measures for a certain data set.
> Has anybody done this with Apache Griffin?
>
> Regards
>
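
The requirement discussed in this thread, averaging measure values that have already been computed instead of re-running each measure, can be sketched roughly as follows. This is only an illustration under stated assumptions: the measure names, the values, and the average_of_measures helper are hypothetical and are not part of Apache Griffin's API. It assumes the previously computed values have already been read back from wherever they were persisted.

```python
from statistics import fmean

# Hypothetical, already-computed quality measures for one dataset.
# The names and values are made up for illustration only.
stored_measures = {
    "accuracy": 0.98,
    "completeness": 0.90,
    "uniqueness": 1.00,
}


def average_of_measures(measures):
    """Return the mean of stored measure values without recomputing them."""
    if not measures:
        raise ValueError("no measure values to average")
    return fmean(measures.values())


# Global quality score for the dataset, derived only from stored values.
dataset_quality = average_of_measures(stored_measures)
print(f"dataset quality: {dataset_quality:.2f}")
```

If the stored values were exposed as a table, the same idea would presumably be an avg over that table of measure values in Spark SQL, rather than over the source data itself.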