Posted to dev@sis.apache.org by Andrew Hart <ah...@apache.org> on 2011/05/26 21:54:45 UTC

SIS scalability

Hello,

I was wondering if there is any available information on how SIS 
performs with large amounts of data, say, hundreds of millions to a 
billion GeoRSS records. Has anyone had experience with SIS at this scale?

Also, does SIS currently offer support for GeoRSS-simple constructs like 
point, line, polygon, box, circle [1]? If "higher level" constructs like 
lines and polygons are supported, what is the behavior when there is 
partial overlap with the region of interest?

Thanks!

Andrew.


[1] http://www.georss.org/simple


Re: SIS scalability

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi there Andrew,

On May 26, 2011, at 12:54 PM, Andrew Hart wrote:

> Hello,
> 
> I was wondering if there is any available information on how SIS 
> performs with large amounts of data, say, hundreds of millions to a 
> billion GeoRSS records. Has anyone had experience with SIS at this scale?

Good question. SIS was originally based on the LocalLucene code (see SIS-1 [1]), which Nga Chung then refactored and significantly adapted (see SIS-3 [2]) and which was released as part of SIS 0.1-incubating. LocalLucene has been tested, and found effective, at that scale.

I don't think anyone has tested the latest code with that many GeoRSS records, though. In its current form, which uses Sun's GeoRSS parser (I can't remember whether it uses SAX), I honestly doubt it will scale to that many records without some fine-tuning.

That said, if you can better elucidate your specific use case, we can always figure out the best way to support it with the current software (e.g., use of multiple qtree indices and location servlets) or identify areas to improve on so that it can more easily support this type of scalability out of the box.
> 
> Also, does SIS currently offer support for GeoRSS-simple constructs like 
> point, line, polygon, box, circle [1]? If "higher level" constructs like 
> lines and polygons are supported, what is the behavior when there is 
> partial overlap with the region of interest?

SIS currently supports box and point. I'm not sure whether it supports circle or polygon yet (my guess is no). Most of the code that performs the searching can be found in the LocationServlet [3] source code, which is reasonably well documented. I think adding support for further spatial constructs (such as spatial data structures) would be useful. Support for polygons and lines is on the SIS roadmap, and we've talked about doing it (see the original proposal [4]).
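
To make the partial-overlap question concrete, here is a minimal sketch of how a line or polygon could be classified against a region of interest using only its bounding box. This is not taken from the SIS code base: the class BoxOverlap, the Box type, and the Relation values are all hypothetical names chosen for illustration, and the boxes are assumed not to cross the antimeridian.

```java
// Hypothetical illustration only; none of these names come from the SIS code base.
final class BoxOverlap {

    /** How a feature's bounding box relates to the region of interest. */
    enum Relation { DISJOINT, PARTIAL, CONTAINED }

    /** Axis-aligned box in degrees; assumed not to cross the antimeridian. */
    static final class Box {
        final double minLat, minLon, maxLat, maxLon;

        Box(double minLat, double minLon, double maxLat, double maxLon) {
            this.minLat = minLat;
            this.minLon = minLon;
            this.maxLat = maxLat;
            this.maxLon = maxLon;
        }
    }

    /** True if the point lies inside the region or on its edge. */
    static boolean containsPoint(Box region, double lat, double lon) {
        return lat >= region.minLat && lat <= region.maxLat
            && lon >= region.minLon && lon <= region.maxLon;
    }

    /** Classifies a feature's bounding box against the region of interest. */
    static Relation classify(Box region, Box feature) {
        // No shared area at all: the feature is entirely to one side of the region.
        boolean disjoint = feature.maxLat < region.minLat
            || feature.minLat > region.maxLat
            || feature.maxLon < region.minLon
            || feature.minLon > region.maxLon;
        if (disjoint) {
            return Relation.DISJOINT;
        }
        // Every edge of the feature box is inside the region.
        boolean contained = feature.minLat >= region.minLat
            && feature.maxLat <= region.maxLat
            && feature.minLon >= region.minLon
            && feature.maxLon <= region.maxLon;
        return contained ? Relation.CONTAINED : Relation.PARTIAL;
    }
}
```

A PARTIAL result only says that the bounding boxes overlap. The line or polygon itself may still miss the region, so an exact geometry test would be needed before deciding whether such a record counts as a match.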

Cheers,
Chris

[1] http://issues.apache.org/jira/browse/SIS-1
[2] http://issues.apache.org/jira/browse/SIS-3
[3] http://svn.apache.org/repos/asf/incubator/sis/trunk/sis-webapp/src/main/java/org/apache/sis/services/LocationServlet.java
[4] http://wiki.apache.org/incubator/SpatialProposal

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++