Posted to dev@sedona.apache.org by "Adam Binford (Jira)" <ji...@apache.org> on 2021/02/26 19:19:00 UTC

[jira] [Created] (SEDONA-19) Global indexing does not work with SQL joins

Adam Binford created SEDONA-19:
----------------------------------

             Summary: Global indexing does not work with SQL joins
                 Key: SEDONA-19
                 URL: https://issues.apache.org/jira/browse/SEDONA-19
             Project: Apache Sedona
          Issue Type: Bug
            Reporter: Adam Binford


According to the documentation, global indexing can be used with SQL joins and is enabled by default. However, the code path here [https://github.com/apache/incubator-sedona/blob/master/sql/src/main/scala/org/apache/spark/sql/sedona_sql/strategy/join/TraitJoinQueryExec.scala#L125] calls the JoinParams constructor here [https://github.com/apache/incubator-sedona/blob/master/core/src/main/java/org/apache/sedona/core/spatialOperator/JoinQuery.java#L434], which always sets useIndex to false. This makes indexed joins impossible via SQL queries, and the non-indexed join does not work well with large datasets (a separate issue: it loads all of both the left and right sides of a partition into memory at once and quickly runs out of memory).

The Python adapter also passes the arguments incorrectly here: [https://github.com/apache/incubator-sedona/blob/master/python-adapter/src/main/scala/org.apache.sedona.python.wrapper/adapters/JoinParamsAdapter.scala#L29]

The signature of JoinParams needs to be updated, probably by just adding a constructor that takes all four parameters.
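
To illustrate the kind of change meant here, a minimal sketch follows. It is not Sedona's actual code: the class name JoinParamsSketch, the String stand-ins for the index type and build side, and the choice of which four parameters are involved are all assumptions made for illustration. Only the useIndex flag and the fact that the existing constructor forces it to false come from the report above.

```java
// Hypothetical, simplified stand-in for a join-parameter holder.
// NOT the real org.apache.sedona JoinParams; field names other than
// useIndex are assumptions made for this sketch.
final class JoinParamsSketch {
    final boolean useIndex;
    final boolean considerBoundaryIntersection; // assumed parameter
    final String indexType;                     // stand-in for an index-type enum
    final String joinBuildSide;                 // stand-in for a build-side enum

    // Existing-style constructor: the caller has no way to request an index,
    // because useIndex is hard-coded to false (the behavior reported above).
    JoinParamsSketch(boolean considerBoundaryIntersection,
                     String indexType,
                     String joinBuildSide) {
        this(false, considerBoundaryIntersection, indexType, joinBuildSide);
    }

    // Proposed-style overload: all four values are supplied by the caller,
    // so a SQL planner could pass its own "use index" setting through.
    JoinParamsSketch(boolean useIndex,
                     boolean considerBoundaryIntersection,
                     String indexType,
                     String joinBuildSide) {
        this.useIndex = useIndex;
        this.considerBoundaryIntersection = considerBoundaryIntersection;
        this.indexType = indexType;
        this.joinBuildSide = joinBuildSide;
    }

    public static void main(String[] args) {
        JoinParamsSketch threeArg = new JoinParamsSketch(true, "RTREE", "LEFT");
        JoinParamsSketch fourArg = new JoinParamsSketch(true, true, "RTREE", "LEFT");
        System.out.println("three-arg useIndex = " + threeArg.useIndex);
        System.out.println("four-arg  useIndex = " + fourArg.useIndex);
    }
}
```

With an overload along these lines, the SQL code path and the Python adapter could each pass the index flag explicitly instead of relying on the three-argument form.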



--
This message was sent by Atlassian Jira
(v8.3.4#803005)