Posted to dev@griffin.apache.org by Lionel Liu <li...@apache.org> on 2018/05/25 13:41:49 UTC

Re:Profiling Job for multiple tables

Hi Karan,


I think it could work, even if it seems a little strange. Griffin supports multiple data sources, as it does for accuracy measures. You can declare 4 data sources with different names.
However, in the rules section, you need to declare a rule for each data source.
For example, if you have data sources s1, s2, s3 and s4,
you need to declare rules like this:
"rules": [
  {
    "rule": "select count(*) from s1",
    ...
  },
  {
    "rule": "select count(*) from s2",
    ...
  },
  {
    "rule": "select count(*) from s3",
    ...
  },
  {
    "rule": "select count(*) from s4",
    ...
  }
]
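
Since every declared data source needs its own rule, a missing entry is easy to overlook as the list grows. The following is a minimal sketch, not part of Griffin, that checks a list of source names against a list of rule strings. The helper name `sources_without_rules` is hypothetical, and it assumes each rule refers to its source by the exact name used in the data source declaration.

```python
import re


def sources_without_rules(source_names, rule_sqls):
    """Return the data source names that no rule mentions (hypothetical helper)."""
    missing = []
    for name in source_names:
        # Whole-word match so that "s1" does not match inside "s10".
        pattern = re.compile(r"\b" + re.escape(name) + r"\b")
        if not any(pattern.search(sql) for sql in rule_sqls):
            missing.append(name)
    return missing


sources = ["s1", "s2", "s3", "s4"]
rules = [
    "select count(*) from s1",
    "select count(*) from s2",
    "select count(*) from s3",
]
print(sources_without_rules(sources, rules))  # prints ['s4']
```

An empty result means every source is covered by at least one rule.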


--

Regards,
Lionel, Liu

At 2018-05-25 19:31:06, "Karan Gupta" <ka...@tavant.com> wrote:


Hi Lionel,

 

I want to run a custom profiling job for multiple tables in one instance. Is it achievable through Griffin? If yes, could you guide me as to how to declare more than 4 sources in the config file and use them in the profiling job?

 

 

Thank you,

Karan Gupta

Any comments or statements made in this email are not necessarily those of Tavant Technologies. The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. If you have received this in error, please contact the sender and delete the material from any computer. All emails sent from or to Tavant Technologies may be subject to our monitoring procedures.

Re: Re:Profiling Job for multiple tables

Posted by Lionel Liu <li...@apache.org>.
Hi Karan,

Here are my answers to your questions:


   1. > Instead of defining ‘n’ number of profiling rules for ‘n’ number of
   data sources, can I make my rule parameterized?

For example -> "rule": "select count(*) from ${source}"


Griffin doesn't parse "spark-sql" rules, so the parameter would not be
recognized by Spark.

I think it would be an interesting feature for Griffin; maybe we can
support it in a later version.



   2. > Can I invoke my profiling rule through a web API?


The Griffin UI only supports a few kinds of profiling measures, but you can
add a new measure with a self-defined rule through the API.
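
Until parameterized rules are supported, the `${source}` placeholder from question 1 can be expanded on the client side before the config is handed to Griffin. The sketch below is only an illustration under that assumption: it uses Python's standard `string.Template`, the helper name `expand_rules` is hypothetical, and the generated entries contain only the "rule" field, so any other rule fields your measure needs still have to be added.

```python
import json
from string import Template

# The parameterized rule from the question; ${source} is filled in per data source.
RULE_TEMPLATE = Template("select count(*) from ${source}")


def expand_rules(source_names):
    """Build one rule entry per data source name (hypothetical helper)."""
    rules = []
    for name in source_names:
        # Reject anything that is not a plain identifier, so that the
        # substituted value cannot change the shape of the SQL statement.
        if not name.isidentifier():
            raise ValueError("unexpected data source name: %r" % name)
        rules.append({"rule": RULE_TEMPLATE.substitute(source=name)})
    return rules


# Print a "rules" fragment that can be merged into the measure config.
print(json.dumps({"rules": expand_rules(["s1", "s2", "s3", "s4"])}, indent=2))
```

Because the expansion happens before Spark sees the SQL, each generated rule is an ordinary "spark-sql" statement naming one data source.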

Thanks,
Lionel

On Thu, May 31, 2018 at 3:01 PM, Karan Gupta <ka...@tavant.com> wrote:

> Hi Lionel,
>
>
>
> I tried the solution below and it works fine. I have a couple of
> questions:
>
>
>
>    1. > Instead of defining ‘n’ number of profiling rules for ‘n’ number
>    of data sources, can I make my rule parameterized?
>
> For example -> "rule": "select count(*) from ${source}"
>
>    2. > Can I invoke my profiling rule through a web API?
>
>
>
> Thank you,
>
> Karan Gupta

RE: Re:Profiling Job for multiple tables

Posted by Karan Gupta <ka...@tavant.com>.
Hi Lionel,

I tried the solution below and it works fine. I have a couple of questions:


  1.  Instead of defining 'n' number of profiling rules for 'n' number of data sources, can I make my rule parameterized?

For example -> "rule": "select count(*) from ${source}"

  2.  Can I invoke my profiling rule through a web API?

Thank you,
Karan Gupta
