Posted to dev@jena.apache.org by Rob Vesse <rv...@yarcdata.com> on 2012/09/12 23:42:08 UTC

Spatial Indexing library for GeoSPARQL

I remember some discussions a while back about one of the barriers to implementing GeoSPARQL in Jena being the lack of a good indexing library to use.

I notice that Lucene 4.0 has a new Spatial module - http://lucene.apache.org/core/4_0_0-BETA/spatial/index.html - which is itself built on another library, Spatial4j, which is ASL licensed.

Would these be sufficient pieces to get us started?  I haven't looked in detail at whether these libraries provide the specific geospatial primitives and functions we'd need to implement GeoSPARQL.

Rob

Re: Spatial Indexing library for GeoSPARQL

Posted by Paolo Castagna <ca...@gmail.com>.
Hi Rob
(a bit out of the loop at the moment, however) do you know if there is
any GeoSPARQL open source implementation around we could look at?

One useful thing to do would be to go through the GeoSPARQL spec and
make a list of the FILTERs which need to be implemented.
What is not "trivial" IMHO is understanding how the optimizer will
handle/transform/implement those FILTERs. Maybe an approach a la LARQ
(i.e. do the geo bit first) should be the first thing to do.
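
As a rough illustration of the "geo bit first" idea, here is a minimal
sketch in Python. It is not how LARQ or the ARQ optimizer actually work;
the toy data, the prefixes, and the names spatial_lookup and
query_geo_first are all made up for the example. The spatial step runs
first and produces candidate subjects, and the rest of the pattern is
checked only for those candidates.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (lon, lat)
Box = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)

# Hypothetical toy data: subject -> point geometry, and subject -> rdf:type.
GEOMETRIES: Dict[str, Point] = {
    "ex:a": (-0.12, 51.50),
    "ex:b": (-0.10, 51.52),
    "ex:c": (2.35, 48.85),
}
TYPES: Dict[str, str] = {
    "ex:a": "ex:Pub",
    "ex:b": "ex:Museum",
    "ex:c": "ex:Pub",
}


def spatial_lookup(box: Box) -> List[str]:
    """Stand-in for a spatial index lookup: subjects whose point is in the box.

    A real index would avoid scanning every geometry; this scan only
    marks where the index call would sit.
    """
    min_lon, min_lat, max_lon, max_lat = box
    return [
        subject
        for subject, (lon, lat) in GEOMETRIES.items()
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]


def query_geo_first(rdf_type: str, box: Box) -> List[str]:
    """Do the geo step first, then test the remaining pattern per candidate."""
    candidates = spatial_lookup(box)
    return sorted(s for s in candidates if TYPES.get(s) == rdf_type)
```

With this data, asking for ex:Pub inside the box (-1.0, 51.0, 0.0, 52.0)
only ever looks at the type of ex:a and ex:b, never ex:c. Whether the
geo step should always go first is exactly the optimizer question raised
above; this sketch just hard-codes that order.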

Marco said in a reply: "the task should not be underestimated"

+1

However, I do believe in the magic of collaboration around open source
software and maybe just describing what needs to be done might help
others to help/contribute to that.

How many FILTERs need to be implemented?

One paragraph for each FILTER to explain what that does, how it could
be implemented with Spatial4j and implications on optimizer/query
execution would IMHO help.

Is there a JIRA issue for this?

An issue vaguely related is JENA-144:
https://issues.apache.org/jira/browse/JENA-144

See also SIS-42 (but I do not think there has been progress on that):
https://issues.apache.org/jira/browse/SIS-42

Paolo


Re: Spatial Indexing library for GeoSPARQL

Posted by Stephen Allen <sa...@apache.org>.
On Thu, Sep 13, 2012 at 5:25 AM, Marco Neumann <ma...@gmail.com> wrote:
> parliament uses postgres for the spatial filter
>

Actually Parliament can use several back-ends.  I believe the default
at this point is a library from the Deegree project (LGPL).  See the
previous thread [1] for a little more background.

-Stephen

[1] http://markmail.org/message/6v7q6nhpd2xleyvs

Re: Spatial Indexing library for GeoSPARQL

Posted by Marco Neumann <ma...@gmail.com>.
On Thu, Sep 13, 2012 at 8:19 AM, Andy Seaborne <an...@apache.org> wrote:

> Parliament?
> opensahara->useekme (uses JTS inside?)
>

Parliament uses Postgres for the spatial filter.

>
> Lucene spatial? (I presume, not having looked, this is a z-ordering - we
> could adapt the B+Trees to do it [1]
>

Take a look at my GeoSPARQL implementation; it uses an open source LGPL
lib.


>
>         Andy
>
> (yes - I'd love to have a go at the persistent index if I could get some
> sort of funding :-)
>


I agree we need a roadmap and some funding for this. If you are in London
next week for SemTech UK we can start the discussion for this effort.

Marco



>
> [1] http://en.wikipedia.org/wiki/UB-tree
> but
>
> http://www.scholarpedia.org/article/B-tree_and_UB-tree#UB-trees_for_Multidimensional_Applications
>
> has pictures!
>


-- 


---
Marco Neumann
KONA

Join us at SemTech Biz in New York City October 15-17, 2012 and save 15%
with code STMN
http://www.lotico.com/evt/SemTechBizNYC2012

Re: Spatial Indexing library for GeoSPARQL

Posted by Andy Seaborne <an...@apache.org>.
On 12/09/12 23:10, Marco Neumann wrote:
> I would say yes the interesting bits are done by JTS. we used another LGPL
> index for geosparql.org.
>
> I think Jena deserves a dedicated file based indexer to support the full
> OGC geosparql standard but that said the task should not be underestimated.

(a certain amount of this is from recall of my last look at GeoSPARQL ...)

There are four parts:

1/ Persistent index (r-tree, quadtree,...)

2/ Algorithms + Geo objects layer
   inc parsers and a Java library to handle geo data

3/ GeoSPARQL specifics - transformation, query rewrite etc etc

4/ Creating the index, either externally or coupled to a dataset.

These don't all have to be perfect and complete on day 1 to provide
users with useful GeoSPARQL capability.  For example, an in-memory index
gets a certain amount of scale over brute force search of all data, and
not all uses are a billion polygons.
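
To make the in-memory point concrete, here is a minimal sketch of a
uniform grid index over bounding boxes, next to the brute force scan it
replaces. This is an assumption-laden toy, not a proposal for the actual
Jena design: the class name GridIndex, the cell_size parameter, and the
bounding-box-only geometry are all invented for the example, and an
r-tree or quadtree would be a different structure.

```python
import math
from collections import defaultdict
from typing import Dict, Iterator, List, Set, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def intersects(a: BBox, b: BBox) -> bool:
    """True if the two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def brute_force(boxes: Dict[str, BBox], query: BBox) -> List[str]:
    """Baseline: test every stored box against the query box."""
    return sorted(k for k, box in boxes.items() if intersects(box, query))


class GridIndex:
    """Hypothetical in-memory index: each box is filed under every grid cell it covers."""

    def __init__(self, cell_size: float = 1.0) -> None:
        self.cell_size = cell_size
        self.cells: Dict[Tuple[int, int], Set[str]] = defaultdict(set)
        self.boxes: Dict[str, BBox] = {}

    def _cells_for(self, box: BBox) -> Iterator[Tuple[int, int]]:
        # floor() rather than int() so negative coordinates land in the right cell.
        x0 = math.floor(box[0] / self.cell_size)
        x1 = math.floor(box[2] / self.cell_size)
        y0 = math.floor(box[1] / self.cell_size)
        y1 = math.floor(box[3] / self.cell_size)
        for cx in range(x0, x1 + 1):
            for cy in range(y0, y1 + 1):
                yield (cx, cy)

    def insert(self, key: str, box: BBox) -> None:
        self.boxes[key] = box
        for cell in self._cells_for(box):
            self.cells[cell].add(key)

    def query(self, box: BBox) -> List[str]:
        """Collect candidates from the touched cells, then test each one exactly."""
        candidates: Set[str] = set()
        for cell in self._cells_for(box):
            candidates |= self.cells.get(cell, set())
        return sorted(k for k in candidates if intersects(self.boxes[k], box))
```

Both functions return the same answers; the grid only reduces how many
boxes get the exact test. It degrades when boxes are much larger than
cell_size, which is one reason a tree-based index is the usual choice
for persistent storage.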


Parliament?
opensahara->useekme (uses JTS inside?)

Lucene spatial? (I presume, not having looked, this is a z-ordering - we
could adapt the B+Trees to do it [1])

	Andy

(yes - I'd love to have a go at the persistent index if I could get some 
sort of funding :-)

[1] http://en.wikipedia.org/wiki/UB-tree
but

http://www.scholarpedia.org/article/B-tree_and_UB-tree#UB-trees_for_Multidimensional_Applications

has pictures!
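
For readers unfamiliar with z-ordering, here is a minimal sketch of the
key idea behind a UB-tree: interleave the bits of the two coordinates so
each 2D point maps to a single integer that an ordered one-dimensional
structure such as a B+Tree can store. Whether Lucene spatial works this
way is not confirmed in this thread; the function names and the 16-bit
default are assumptions for the example, and coordinates are taken to be
non-negative integers (real data would first be scaled onto a grid).

```python
from typing import List, Tuple


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Z-order (Morton) key: bit i of x goes to bit 2i, bit i of y to bit 2i+1."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def deinterleave_bits(z: int, bits: int = 16) -> Tuple[int, int]:
    """Inverse of interleave_bits: recover (x, y) from a z-order key."""
    x = 0
    y = 0
    for i in range(bits):
        x |= ((z >> (2 * i)) & 1) << i
        y |= ((z >> (2 * i + 1)) & 1) << i
    return x, y


def z_sorted(points: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Order points by z-order key, as keys in an ordered index would be."""
    return sorted(points, key=lambda p: interleave_bits(p[0], p[1]))
```

Sorting the four corners of a unit square this way visits them in the
familiar "Z" shape. Note that this covers only the key encoding: a
rectangular range query over z-ordered keys needs extra logic to skip
key ranges that fall outside the rectangle, which is what the UB-tree
articles above describe.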



Re: Spatial Indexing library for GeoSPARQL

Posted by Marco Neumann <ma...@gmail.com>.
I would say yes, the interesting bits are done by JTS. We used another LGPL
index for geosparql.org.

I think Jena deserves a dedicated file-based indexer to support the full
OGC GeoSPARQL standard, but that said, the task should not be underestimated.








Re: Spatial Indexing library for GeoSPARQL

Posted by Rob Vesse <rv...@yarcdata.com>.
If I read the documentation correctly, it can optionally use the JTS
library (which, yes, is LGPL and so a no-go for Apache projects) if that
library is needed; it can also be used without it.

I'm not sure if the extra features that JTS provides are necessary for a
GeoSPARQL implementation because I'm not up to speed on exactly what
GeoSPARQL requires.

Rob




Re: Spatial Indexing library for GeoSPARQL

Posted by Marco Neumann <ma...@gmail.com>.
It uses the JTS Topology Suite indexer, which hasn't been updated for a
while but is open source under the LGPL license.





