Posted to dev@sis.apache.org by Martin Desruisseaux <de...@apache.org> on 2014/05/23 20:10:18 UTC

Feature completed & micro-benchmarks

Hello all

The org.apache.sis.feature package [1] is now complete, except for bug
fixes as we discover them and for GeoAPI compliance based on what will
emerge at OGC.

The implementation details are not trivial, but this complexity should
be hidden from the API. To justify this complexity, I have run a
micro-benchmark. A major design goal was to reduce memory usage.
Consider a ShapeFile or a database table with millions of records. Each
record is represented by one Feature instance. Sophisticated DataStore
implementations will create and discard Feature instances on the fly,
but not all DataStores do that, and some applications may keep the
features in memory for whatever reason. As a safety measure, Apache SIS
tries to implement Feature in a way that allows applications to scale
higher before dying with an OutOfMemoryError.

A straightforward Feature implementation would use a java.util.HashMap
as below:

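    import java.util.HashMap;
    import java.util.Map;
    class SimpleFeature {
        // One HashMap per feature instance: attribute name -> value.
        final Map<String,Object> attributes = new HashMap<>();
    }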

This is indeed how the previous 'DefaultFeature' class was implemented
prior to the recent work. Note that this is sometimes called a "simple
feature" since it does not provide explicit support for multi-valued
properties (admittedly, it has implicit support if we store multiple
values as a java.util.Collection).
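
To illustrate what "implicit support" means here, below is a minimal
sketch using the same map layout as SimpleFeature. The class name, the
attribute names and the values are hypothetical and are not taken from
the SIS code base. The point is only that the map itself knows nothing
about cardinality, so the caller has to test and cast.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

class MultiValuedExample {
    /** Same layout as SimpleFeature: one HashMap per feature instance. */
    static Map<String,Object> cityWithAlternateNames() {
        Map<String,Object> attributes = new HashMap<>();
        attributes.put("name", "Montreal");        // single value
        // Hypothetical multi-valued attribute, stored as a Collection.
        attributes.put("alternateNames", Arrays.asList("Montreal", "Ville-Marie"));
        return attributes;
    }

    /** Returns the number of values, treating any non-collection as one value. */
    static int countValues(Object value) {
        return (value instanceof Collection<?>) ? ((Collection<?>) value).size() : 1;
    }
}
```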

Consider a trivial database table with only 3 columns:

* city name (String of 8 characters)
* latitude (float)
* longitude (float)

Now let us try to load records in a JVM limited to 100 MB of memory. The
number of SimpleFeature instances that we can load before getting an
OutOfMemoryError is about 320,000 in my tests, while the number of
Apache SIS feature instances that we can load is about 640,000. This is
twice as many, despite the fact that Apache SIS has all the
functionality of a "complex" feature (type inheritance, explicit
association roles and multi-valued properties).
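
For readers who want to try this kind of measurement themselves, here
is a rough sketch of the HashMap side of the experiment. It is not the
benchmark code referenced in this message: the class name, the method
names and the generated values are assumptions made for illustration,
and the counts obtained will depend on the JVM and its settings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FeatureCountSketch {
    /** Creates one record with the three columns of the table above. */
    static Map<String,Object> newSimpleFeature(int i) {
        Map<String,Object> attributes = new HashMap<>();
        attributes.put("city", String.format("City%04d", i % 10000));  // 8 characters
        attributes.put("latitude",  (float) (i % 180 - 90));
        attributes.put("longitude", (float) (i % 360 - 180));
        return attributes;
    }

    /**
     * Keeps creating features until 'limit' is reached or memory runs out,
     * then returns the number of instances that were held in memory.
     */
    static int fill(int limit) {
        List<Object> features = new ArrayList<>();
        try {
            while (features.size() < limit) {
                features.add(newSimpleFeature(features.size()));
            }
        } catch (OutOfMemoryError e) {
            // Expected end of the experiment when 'limit' is large enough.
        }
        return features.size();
    }
}
```

To approximate the conditions described above, one would call
fill(Integer.MAX_VALUE) from a main method in a JVM started with the
-Xmx100m option, then print the returned count.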

I tried to provide a few more details in [2].

    Martin


[1]
https://builds.apache.org/job/sis-dev/javadoc/org/apache/sis/feature/package-summary.html
[2]
https://svn.apache.org/repos/asf/sis/trunk/core/sis-feature/src/main/java/org/apache/sis/feature/benchmarks.html