Posted to dev@polygene.apache.org by Niclas Hedhman <ni...@hedhman.org> on 2017/02/19 17:37:52 UTC

Cassandra ES

Gang,

I got to the point where the Apache Cassandra EntityStore is passing the
standard testcases.

I have no performance numbers yet, but we know from MemoryEntityStore that
serialization eats up a lot of the time, so my guess is that serialization
is a bigger bottleneck than Cassandra itself.

I decided not to build on MapEntityStore, and instead serialize directly to
fields in Cassandra. See schema below. It has been quite sweet to work with
CQL compared to the old native Thrift API.

Things left to do:
  * Do we need to handle the "Session" ourselves, so that if there are
communication problems, we kill it and create a new one? I have assumed that
the Datastax client handles this automatically.

  * Is the balance of configuration in Polygene config reasonable, or
should more of the configurable settings be moved? ATM, it is possible to
programmatically add additional config options.

  * Performance testing.

  * More documentation?

  * Live action - I am working on a project to use this within the next few
months.

  * ????

If anyone has extensive experience with Cassandra, then let me know if
it makes sense to have an "entity" table as follows:

CREATE TABLE entities (
    id text,
    type text,
    version text,
    appversion text,
    storeversion text,
    usecase text,
    modified timestamp,
    properties map<text,text>,
    assocs map<text,text>,
    manyassocs map<text,text>,
    namedassocs map<text,text>,
    PRIMARY KEY ( id )
);

id = entity identity as string
type = primary type of the entity
version = save version of entity
appversion = version of the application (to support Migration)
storeversion = the Cassandra ES implementation, in case we change the store
format.
usecase = the usecase that created or last updated the entity

properties = map with property name as key and the serialized value as value
assocs = map with association name as the key and the identity string as
value
manyassocs = map of many associations with the name as key and a
comma-separated list of identity strings as value. (should maybe JSON
serialize it instead, to avoid comma problems)
namedassocs = key is name of the named association and the value is a
serialized string of a map with the name/identity.
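
To make the comma problem in manyassocs concrete, here is a minimal Java
sketch comparing the two encodings. It is not the actual Cassandra ES code:
the class and method names are hypothetical, and the JSON handling is a
hand-rolled illustration that only escapes quotes and backslashes. A real
implementation would presumably use a proper JSON library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Polygene.
class ManyAssocEncoding {

    // Comma-separated form, as currently described for the manyassocs column.
    static String toCommaSeparated(List<String> identities) {
        return String.join(",", identities);
    }

    static List<String> fromCommaSeparated(String value) {
        if (value.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(value.split(",", -1)));
    }

    // JSON array form. Only '"' and '\' are escaped in this sketch.
    static String toJsonArray(List<String> identities) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < identities.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"');
            for (char c : identities.get(i).toCharArray()) {
                if (c == '"' || c == '\\') {
                    sb.append('\\');
                }
                sb.append(c);
            }
            sb.append('"');
        }
        return sb.append(']').toString();
    }

    // Parses only the output of toJsonArray above, not arbitrary JSON.
    static List<String> fromJsonArray(String json) {
        List<String> result = new ArrayList<>();
        StringBuilder current = null; // null means "outside a string"
        for (int i = 1; i < json.length() - 1; i++) {
            char c = json.charAt(i);
            if (current == null) {
                if (c == '"') {
                    current = new StringBuilder();
                }
                // anything else here is the comma between elements
            } else if (c == '\\') {
                current.append(json.charAt(++i)); // take escaped char as-is
            } else if (c == '"') {
                result.add(current.toString());
                current = null;
            } else {
                current.append(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two identities, the first one containing a comma.
        List<String> ids = Arrays.asList("order,2017", "customer-1");

        // The comma-separated round trip yields 3 elements instead of 2.
        System.out.println(fromCommaSeparated(toCommaSeparated(ids)));

        // The JSON array round trip returns the original list.
        System.out.println(fromJsonArray(toJsonArray(ids)));
    }
}
```

If identity strings are guaranteed never to contain commas, the
comma-separated form is enough; the JSON form only matters when that
guarantee cannot be made.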


-- 
Niclas Hedhman, Software Developer
http://polygene.apache.org <http://zest.apache.org> - New Energy for Java