Posted to user@spark.apache.org by Jason Heo <ja...@gmail.com> on 2018/01/03 04:17:15 UTC
[Spark SQL] How to run a custom meta query for `ANALYZE TABLE`
Hi,
I'm working on integrating Spark and a custom data source.
Most things have gone well with the Spark Data Source APIs (thanks for the
well-designed APIs).
But one thing I couldn't resolve is how to execute a custom meta query
for `ANALYZE TABLE`.
The custom data source I'm currently working on has a meta query, so we can
get MIN/MAX/cardinality without a full scan.
What I want is that when `ANALYZE TABLE` is executed over the custom
data source, the custom meta query is executed instead of a full scan.
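To make the desired behaviour concrete, here is a minimal sketch in plain Python. It is not Spark API: `ColumnStats`, `MetaQuery`, `full_scan_stats` and `analyze_table` are all hypothetical names invented for illustration, and the meta query is assumed to return None when the source cannot answer without scanning.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class ColumnStats:
    """Per-column statistics of the kind the meta query can return."""
    min_value: int
    max_value: int
    cardinality: int


# Hypothetical shape of the source's meta query: (table, column) -> stats,
# or None when the source cannot answer without scanning.
MetaQuery = Callable[[str, str], Optional[ColumnStats]]

# Hypothetical full scan: table name -> all rows as dicts.
Scan = Callable[[str], Iterable[dict]]


def full_scan_stats(rows: Iterable[dict], column: str) -> ColumnStats:
    """Fallback: derive the statistics by reading every row (assumes >= 1 row)."""
    values = [row[column] for row in rows]
    return ColumnStats(min(values), max(values), len(set(values)))


def analyze_table(
    table: str, columns: List[str], meta_query: MetaQuery, scan: Scan
) -> Dict[str, ColumnStats]:
    """Prefer the meta query; scan only for columns it cannot answer."""
    stats: Dict[str, ColumnStats] = {}
    for column in columns:
        answer = meta_query(table, column)
        if answer is None:
            # The scan is triggered lazily, so it never runs when the
            # meta query covers every requested column.
            answer = full_scan_stats(scan(table), column)
        stats[column] = answer
    return stats
```

The point of the sketch is only the ordering: the cheap meta query is consulted first, and the full scan stays as a fallback rather than the default path.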
If this is not possible, I'm considering inserting stats into metastore_db
manually. Is there any API exposed for working with metastore_db (e.g. to
insert/delete entries in the meta DB)?
Regards,
Jason
Re: [Spark SQL] How to run a custom meta query for `ANALYZE TABLE`
Posted by Jörn Franke <jo...@gmail.com>.
Hi,
No, this is not possible with the current data source API. However, a new data source API (v2) is on its way; maybe it will support this.
Alternatively, your data source could offer a config option to calculate the metadata after an insert.
Could you also explain which DB your data source is for, and when this meta query should be executed?