Posted to dev@carbondata.apache.org by Indhumathi <in...@gmail.com> on 2021/04/15 16:07:29 UTC

Re: [DISCUSSION] Support JOIN query with spatial index

Hello all,

Please find the design document attached to the JIRA issue CARBONDATA-4166
<https://issues.apache.org/jira/browse/CARBONDATA-4166>.
Any inputs or suggestions from the community are most welcome.

Regards,
Indhumathi



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: [DISCUSSION] Support JOIN query with spatial index

Posted by Ajantha Bhat <aj...@gmail.com>.
ok.
+1 from my side.

If the polygon join query still has a performance bottleneck, we can
optimize it later.

Thanks,
Ajantha

On Tue, Apr 27, 2021 at 3:59 PM Indhumathi <in...@gmail.com> wrote:

> Thanks Ajantha for your inputs.
>
> I have modified the design by adding the ToRangeList UDF filter as an
> implicit column projection to the polygon table dataframe, and modified the
> JOIN condition to use the range list UDF column, in order to improve
> performance.
>
> This way, we can reduce quadtree construction from N*M times to M times.
> I have attached the new design document to the JIRA.
> CARBONDATA-4166 <https://issues.apache.org/jira/browse/CARBONDATA-4166>
>
> Regards,
> Indhumathi

Re: [DISCUSSION] Support JOIN query with spatial index

Posted by David CaiQiang <da...@gmail.com>.
+1



-----
Best Regards
David Cai

Re: [DISCUSSION] Support JOIN query with spatial index

Posted by Indhumathi <in...@gmail.com>.
Thanks Ajantha for your inputs.

I have modified the design by adding the ToRangeList UDF filter as an
implicit column projection to the polygon table dataframe, and modified the
JOIN condition to use the range list UDF column, in order to improve
performance.

This way, we can reduce quadtree construction from N*M times to M times.
I have attached the new design document to the JIRA.
CARBONDATA-4166 <https://issues.apache.org/jira/browse/CARBONDATA-4166>  
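To make the modified JOIN condition concrete, here is a minimal sketch of how a join could test a row's spatial index value against a polygon's precomputed range list. It is not taken from the design document: it assumes each range list is a sorted, non-overlapping list of inclusive (lo, hi) index intervals, and the names in_range_list, polygons and rows, along with all sample values, are hypothetical.

```python
from bisect import bisect_right


def in_range_list(index_id, ranges):
    """Return True if index_id lies inside any inclusive (lo, hi) interval.

    Assumption of this sketch: ranges is sorted by lo and non-overlapping.
    """
    starts = [lo for lo, _ in ranges]
    # Find the last interval whose start is <= index_id.
    pos = bisect_right(starts, index_id) - 1
    return pos >= 0 and index_id <= ranges[pos][1]


# Hypothetical polygon table: each polygon already carries its range list,
# playing the role of the projected range list column.
polygons = {
    "poly_a": [(10, 20), (40, 45)],
    "poly_b": [(18, 30)],
}

# Hypothetical spatial table rows: (row id, spatial index value).
rows = [("r1", 12), ("r2", 25), ("r3", 42), ("r4", 99)]

# Join condition: the row's index value falls inside the polygon's range list.
joined = [
    (row_id, name)
    for row_id, index_id in rows
    for name, ranges in polygons.items()
    if in_range_list(index_id, ranges)
]
print(joined)
```

The binary search is only one possible way to evaluate the condition; a linear scan over the intervals would give the same result for short range lists.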

Regards,
Indhumathi

Re: [DISCUSSION] Support JOIN query with spatial index

Posted by Ajantha Bhat <aj...@gmail.com>.
Hi,
I think the latest document has now addressed my previous comments and
questions.

The design of the polygon list query and the polyline list query looks OK.

But I have a performance concern about the design of the polygon query with
join. In this approach, we use a union polygon filter on spatial_table to
prune down to the blocklet level.
In the worst case, it may select all the rows in a blocklet, and we then join
this output (N rows) with the polygon table output (M rows), which checks the
IN_POLYGON condition again during the join, N*M times. I don't have a
different solution at the moment either.

But we can further optimize the current solution as follows:
a) For the polygon table output, you can reduce quadtree construction from
N*M times to M times and use the quadtree output as a range filter/UDF for
the join.
b) Later, if we need more improvement, we can try row-level filtering on the
spatial table.
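As a rough illustration of point (a), the sketch below counts how often the polygon-to-range conversion runs when it is repeated for every (row, polygon) pair versus computed once per polygon before the join. Everything in it is an assumption made for illustration: to_range_list is a toy stand-in for the quadtree step that only returns the bounding interval of a polygon's index values, and the row and polygon data are invented.

```python
build_count = 0  # number of times the stand-in quadtree step has run


def to_range_list(polygon):
    """Toy stand-in for the expensive quadtree step (hypothetical).

    Returns a single bounding interval over the polygon's index values.
    """
    global build_count
    build_count += 1
    return [(min(polygon), max(polygon))]


def matches(index_id, ranges):
    return any(lo <= index_id <= hi for lo, hi in ranges)


def naive_join(rows, polygons):
    # Rebuilds the range list for every (row, polygon) pair: N*M builds.
    return [
        (r, name)
        for r in rows
        for name, poly in polygons.items()
        if matches(r, to_range_list(poly))
    ]


def projected_join(rows, polygons):
    # Builds each range list once, as a projected column: M builds.
    projected = {name: to_range_list(poly) for name, poly in polygons.items()}
    return [
        (r, name)
        for r in rows
        for name, ranges in projected.items()
        if matches(r, ranges)
    ]


rows = [5, 12, 25, 42]                                      # N = 4
polygons = {"poly_a": [10, 15, 20], "poly_b": [18, 30, 24]}  # M = 2

build_count = 0
naive_result = naive_join(rows, polygons)
naive_builds = build_count

build_count = 0
projected_result = projected_join(rows, polygons)
projected_builds = build_count

print(naive_builds, projected_builds)
```

Both variants still compare every row against every polygon, so the join itself remains N*M comparisons; only the costly range-list construction drops to M.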

Thanks,
Ajantha



On Thu, Apr 15, 2021 at 9:37 PM Indhumathi <in...@gmail.com> wrote:

> Hello all,
>
> Please find the design document attached to the JIRA issue CARBONDATA-4166
> <https://issues.apache.org/jira/browse/CARBONDATA-4166>.
> Any inputs or suggestions from the community are most welcome.
>
> Regards,
> Indhumathi