Posted to dev@sedona.apache.org by Zachary Paden <za...@locallogic.co> on 2023/02/15 17:48:46 UTC

[feature-request] ST_GEOHASH_COVER and ST_GEOHASH_COMPRESSION

Hello all,

As a Sedona user, I'd like to be able to create geohash covers and compress
geohashes. These are features of some IBM libraries
<https://www.ibm.com/docs/ja/db2-warehouse?topic=concepts-geohashes-geohash-covers>.

Essentially, I'd like to add something akin to an `ST_GEOHASH_COVER` function
and perhaps an `ST_GEOHASH_COMPRESSION` function. Surprisingly, this doesn't
have much support in OSS libraries as far as I can tell, but implementing it
on some level should be doable IMO.

This would allow users to use geohashes as another way to index their data,
one that is more compatible with other query engines (e.g., anything that
can support geohashes on some level).

Re: [feature-request] ST_GEOHASH_COVER and ST_GEOHASH_COMPRESSION

Posted by Jia Yu <ji...@apache.org>.
Hi George,

H3 implementation is on our agenda. We will implement H3 once S2 is
finished.

Thanks,
Jia

On Fri, Feb 17, 2023 at 9:27 AM George Percivall <pe...@ieee.org> wrote:

> Jia,
>
> Is there an H3 implementation in Sedona?
>
> https://h3geo.org/
> https://github.com/uber/h3
>
> George Percivall
> GeoRoundtable.xyz <https://www.linkedin.com/company/geo-roundtable/>
> percivall@ieee.org

Re: [feature-request] ST_GEOHASH_COVER and ST_GEOHASH_COMPRESSION

Posted by George Percivall <pe...@ieee.org.INVALID>.
Jia,

Is there an H3 implementation in Sedona?

https://h3geo.org/
https://github.com/uber/h3

George Percivall
GeoRoundtable.xyz <https://www.linkedin.com/company/geo-roundtable/>
percivall@ieee.org



Re: [feature-request] ST_GEOHASH_COVER and ST_GEOHASH_COMPRESSION

Posted by Jia Yu <ji...@apache.org>.
Hi,

1. I think a single function st_geohashcover(geom, precision: optional) is
good. If precision is not given, we generate geohashes at a hard-coded
precision.

A very common use case of GeoHash is spatial join. This requires that
all geoms involved in the join use the same geohash precision, so we
shouldn't let the function itself choose the precision.

2. The produced GeoHash codes, if stitched together, should fully cover the
geometry.

3. It is OK to have ST_GeoHash_Compression, but I would give it lower
priority.

4. I think st_geohashcover will take some effort to implement, because you
cannot just take the geohashes of the 4 corner coordinates of an MBR as the
result. You need to find all the interior geohash codes as well.

5. An S2 implementation that can be used in spatial join is being
implemented in Sedona. It is mostly done. You can take a look:
https://github.com/apache/sedona/pull/764
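
To illustrate point 4, here is a minimal Python sketch (not Sedona code) of a
full cover for a bounding box at a fixed precision. The names
`cell_to_geohash` and `geohash_cover_bbox` are hypothetical. Covering an
arbitrary geometry would additionally need a per-cell intersection test,
which is left out here.

```python
import math

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def cell_to_geohash(col, row, precision):
    """Interleave column (longitude) and row (latitude) bits into a geohash."""
    total = 5 * precision
    lon_bits = (total + 1) // 2   # longitude gets the extra bit when odd
    lat_bits = total // 2
    value = 0
    for i in range(total):
        if i % 2 == 0:            # even positions carry longitude bits
            bit = (col >> (lon_bits - 1 - i // 2)) & 1
        else:                     # odd positions carry latitude bits
            bit = (row >> (lat_bits - 1 - i // 2)) & 1
        value = (value << 1) | bit
    return "".join(
        BASE32[(value >> (5 * (precision - 1 - k))) & 31]
        for k in range(precision)
    )


def geohash_cover_bbox(min_lon, min_lat, max_lon, max_lat, precision):
    """Return every geohash cell of the given precision that overlaps the box,
    including interior cells that none of the four corners fall into."""
    total = 5 * precision
    n_cols = 1 << ((total + 1) // 2)
    n_rows = 1 << (total // 2)
    width = 360.0 / n_cols
    height = 180.0 / n_rows

    def index(value, origin, size, count):
        # Grid index of a coordinate, clamped to the valid range.
        return min(count - 1, max(0, math.floor((value - origin) / size)))

    c0 = index(min_lon, -180.0, width, n_cols)
    c1 = index(max_lon, -180.0, width, n_cols)
    r0 = index(min_lat, -90.0, height, n_rows)
    r1 = index(max_lat, -90.0, height, n_rows)
    return [
        cell_to_geohash(c, r, precision)
        for r in range(r0, r1 + 1)
        for c in range(c0, c1 + 1)
    ]
```

For a box that spans three columns and three rows of cells, this returns nine
geohashes, whereas encoding only the four corners would yield four.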


Thanks,
Jia


Re: [feature-request] ST_GEOHASH_COVER and ST_GEOHASH_COMPRESSION

Posted by Zachary Paden <za...@locallogic.co>.
Yes, some basic functions already exist :)

After thinking about it some more, I'm wondering if it may be better to
implement something like:
- st_geohash_envelope_cover(envelope) -> List<Geohash String>
- st_geohash_contained_geohashes_at_depth(Geohash, int precision) ->
List<Geohash String>

I believe the cover algorithm I plan on implementing will essentially
be:

1. Given an envelope
2. From the envelope size, derive the geohash scale such that, in the worst
case, each corner is in a distinct geohash.
3. Return a list of the geohashes

The above is O(1).
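
The three steps above could be sketched in Python roughly as follows. This is
only an illustration under stated assumptions: `encode` and `envelope_cover`
are hypothetical names, the precision is capped at an assumed maximum of 12,
and an envelope larger than a single precision-1 cell is rejected rather than
handled.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def encode(lon, lat, precision):
    """Encode a point as a geohash by repeated bisection."""
    lon_lo, lon_hi = -180.0, 180.0
    lat_lo, lat_hi = -90.0, 90.0
    chars, value = [], 0
    for i in range(5 * precision):
        if i % 2 == 0:                      # even bits refine longitude
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            if bit:
                lon_lo = mid
            else:
                lon_hi = mid
        else:                               # odd bits refine latitude
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            if bit:
                lat_lo = mid
            else:
                lat_hi = mid
        value = (value << 1) | int(bit)
        if i % 5 == 4:                      # every 5 bits form one character
            chars.append(BASE32[value])
            value = 0
    return "".join(chars)


def cell_size(precision):
    """Width and height in degrees of a geohash cell at this precision."""
    total = 5 * precision
    return 360.0 / (1 << ((total + 1) // 2)), 180.0 / (1 << (total // 2))


def envelope_cover(min_lon, min_lat, max_lon, max_lat, max_precision=12):
    """Cover an envelope with at most four geohashes.

    Picks the finest precision whose cell is at least as large as the
    envelope, so the envelope spans at most 2 x 2 cells and the four
    corner geohashes are a complete cover.
    """
    env_w, env_h = max_lon - min_lon, max_lat - min_lat
    precision = 0
    for p in range(1, max_precision + 1):   # bounded loop, independent of input size
        w, h = cell_size(p)
        if w >= env_w and h >= env_h:
            precision = p
        else:
            break
    if precision == 0:
        raise ValueError("envelope is larger than a precision-1 cell")
    corners = [(min_lon, min_lat), (min_lon, max_lat),
               (max_lon, min_lat), (max_lon, max_lat)]
    return sorted({encode(lon, lat, precision) for lon, lat in corners})
```

Note that the precision here depends on the envelope size, so two envelopes of
different sizes would get geohashes of different lengths.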

If we then add functions to enumerate contained geohashes (e.g., for geohash
DR, give all three-character geohashes contained by DR: {DR1, DR2, ...}), we
can let users use other functionality to do whatever they'd like. At the same
time, once we implement the first few foundational functions, the rest are
much easier.
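
As a rough sketch of what st_geohash_contained_geohashes_at_depth could
return, the Python function below (hypothetical name, lowercase geohashes
assumed) appends every combination of base-32 characters to the prefix until
the requested length is reached.

```python
from itertools import product

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def contained_geohashes_at_depth(geohash, precision):
    """List all geohashes of length `precision` contained in `geohash`."""
    extra = precision - len(geohash)
    if extra < 0:
        raise ValueError("precision is shorter than the input geohash")
    # Each extra character multiplies the number of cells by 32.
    return [geohash + "".join(suffix) for suffix in product(BASE32, repeat=extra)]
```

Each additional character multiplies the result size by 32, so "dr" at
precision 3 yields 32 geohashes and at precision 4 yields 1024.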

For example:
- Users may want to compact or not compact geohashes
- They may wish to use INTERSECTS instead of COVERS, or any other predicate
- They may wish to increase or decrease the resolution of the returned
geohashes to get a more or less accurate version
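
One possible reading of "compacting" is sketched below in Python: whenever all
32 children of a parent cell are present, they are replaced by the parent, and
this is repeated until nothing changes. The name `compact_geohashes` is
hypothetical, and the input is assumed to contain only valid lowercase
geohashes.

```python
def compact_geohashes(geohashes):
    """Merge complete sets of 32 sibling geohashes into their parent."""
    cells = set(geohashes)
    changed = True
    while changed:
        changed = False
        # Group the current cells by their parent prefix.
        by_parent = {}
        for g in cells:
            if len(g) > 1:
                by_parent.setdefault(g[:-1], set()).add(g)
        for parent, children in by_parent.items():
            if len(children) == 32:     # every child present: parent is fully covered
                cells -= children
                cells.add(parent)
                changed = True
    return sorted(cells)
```

The result covers exactly the same area as the input with fewer, mixed-length
geohashes, which is why compaction is not suitable when a join needs a single
precision.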

This is also somewhat easy to do, but I'm not a Java developer, so it may
take me a while. If you or anyone else has any thoughts on which
functions should be implemented, I'd also be happy to hear them :)


Re: [feature-request] ST_GEOHASH_COVER and ST_GEOHASH_COMPRESSION

Posted by Jia Yu <ji...@apache.org>.
BTW, the basic GeoHash function is already in Sedona. I believe this
function just needs to extend those basic GeoHash funcs.


Re: [feature-request] ST_GEOHASH_COVER and ST_GEOHASH_COMPRESSION

Posted by Jia Yu <ji...@apache.org>.
This is a fantastic idea. Would you please add it to Sedona? This will
definitely benefit many Sedona users!

Jia
