You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@polygene.apache.org by Niclas Hedhman <ni...@hedhman.org> on 2017/06/11 03:41:24 UTC

JOOQ EntityStore

Gang,

This is just a heads-up and request for feedback.... I am very close to
finish the first pass at the JOOQ Entity Store, which uses JOOQ to handle
SQL details across SQL dialects.

The intent is that an "enterprisey" developer would go, "Yeah, that could
work..." rather than the current "Use SQL as Key/Value store" approach that
we did initially.

The structure is as follows (JOOQ generated queries);

The TYPES table keeps mapping between Mixin types and table names. In
general, the Class.getSimpleName() is used as table name, but if there is a
conflict, then a "_1", "_2" and so on is added.

create table if not exists "POLYGENE"."TYPES"(
    "_identity" varchar null,
    "_table_name" varchar null,
    "_created_at" timestamp null,
    "_modified_at" timestamp null
)

The ENTITIES table is containing the meta data about the entities. This is
effectively the built-in data in EntityState. The "_value_id" field is the
foreign key into the generated mixin tables. This is done this way to
support the "Identity+Value" view of entities that we may have later, and
that history can be preserved (but not fully implemented yet), and the
"_deleted_at" field is there for this same reason, and also not used
initially.

create table if not exists "POLYGENE"."ENTITIES"(
    "_identity" varchar null,
    "_app_version" varchar null,
    "_value_id" varchar null,
    "_version" varchar null,
    "_type" varchar null,
    "_modified_at" timestamp null,
    "_created_at" timestamp null,
    "_deleted_at" timestamp null
)

Then for each Mixin type, two tables are created. The first one is to store
Properties and Associations. The second is to store the ManyAssociations
and NamedAssociations. I chose a single table, to reduce the amount of
tables that needs to be managed, but perhaps it should be one table per
many-to-many relation in the model.

The primary table for the ES testcase (TestEntity) looks like;

create table "POLYGENE"."TestEntity"(
    "_identity" varchar null,
    "_created_at" timestamp null,
    "instantValue" timestamp with time zone null,
    "bigIntegerValue" varchar null,
    "bigDecimalValue" varchar null,
    "dateTimeValue" timestamp with time zone null,
    "localDateTimeValue" timestamp null,
    "localDateValue" date null,
    "localTimeValue" time null,
    "duractionValue" varchar null,
    "periodValue" varchar null,
    "valueProperty" varchar null,
    "unsetName" varchar null,
    "emptyName" varchar null,
    "name" varchar null,
    "booleanValue" boolean null,
    "intValue" integer null,
    "longValue" bigint null,
    "floatValue" float null,
    "doubleValue" double null,
    "association" varchar null,
    "unsetAssociation" varchar null
)

and the assocs table;

create table if not exists "POLYGENE"."TestEntity_ASSOCS"(
    "_identity" varchar null,
    "_name" varchar null,
    "_index" varchar null,
    "_reference" varchar null
)

The "_name" is the name of the association in the mixin, say "children" for

    ManyAssociation<Person> children();

and the "_index" is the position inside the ManyAssociation or the key/name
of NamedAssociations.

For both of these tables, the "_identity" is the "_value_id" in the
ENTITIES table above.


Composite get() results in 2 queries, one for properties/assocs and one for
the _ASSOCS tables.

select *
    from "POLYGENE"."ENTITIES"
    left outer join "POLYGENE"."TestEntity" on "_value_id" =
"TestEntity"."_identity"
    where "ENTITIES"."_identity" = 'ca22d692-5461-4756-8316-5a214d055aa5'

select *
    from "POLYGENE"."ENTITIES"
    join "POLYGENE"."TestEntity_ASSOCS" on "_value_id" =
"TestEntity_ASSOCS"."_identity"
    where "ENTITIES"."_identity" = 'ca22d692-5461-4756-8316-5a214d055aa5'

In each of those queries, there is a "join" line per Mixin type, so highly
composed

The idea is that ValueComposites are still going to be JSON serialized into
fields and leave it in the hands of developers to decide whether to go with
more entities or use values. This serialization is not yet implemented and
the main part that is not working yet.
Other things still outstanding;
  * Primary Key constraints
  * Caching of TYPES, to reduce queries
  * Caching of queries (there is only 2 queries per primaryType)
  * Design improvements, not totally happy with the implementation.
  * History support
  * ALTER TABLE and Migration (big one)
  * Allow more JOOQ config to be specified

Interestingly enough, the current design (without history) preserves state
of sub-types, which I think is no longer supported in other ES
implementations. This is just a side-effect of how the storage is
structured.

So, this is on the brink of operational (6 of 8 tests in
AbstractEntityStoreTest passes), and before digging into the serialization
of Values, I would like to check with you guys if this sounds reasonable,
and if there are any ways to make this even more tempting.

I am for instance seeking to change the "entitystore-sql" to another name
(for instance sqlkv) or even remove it completely, and this store take the
"entitystore-sql" name. Otherwise, I think too many people will discard
Polygene after seeing "SQL ES" key/value storage pattern.

Personally, I think that this is fairly clean and something I think
SQL-happy people would consider an alternative to Hibernate.


I am committing/pushing this to branch "es-jooq" branch shortly, if you
want to take a closer look and feel free to help on implementation.


P.S. I am also keen on improving/expanding the ES test suite. It is
currently a bit rudimentary, and I think we should try to improve that.


Cheers
-- 
Niclas Hedhman, Software Developer
http://polygene.apache.org - New Energy for Java

Re: JOOQ EntityStore

Posted by Niclas Hedhman <ni...@hedhman.org>.
Well, sort of;

ENTITIES table has a "_value_id", that is primary key in each Mixin table
and the associated ASSOCS table. Then the ASSOCS table's "_reference"
column contains the Many/Named Associated EntityReference.

Hope that explains it well enough.



On Mon, Jun 12, 2017 at 9:34 PM, Kent Sølvsten <ke...@gmail.com>
wrote:

> Hi Niclas.
>
> Looks good.
>
> How do you handle the keys in named associations?
>
> If an entry in the ASSOCS_ table is pointing at a specific entity, does
> that mean, that for each relation there will be 2 entries in the table, one
> for each side?
>
> /Kent
>
> On Sun, Jun 11, 2017 at 5:41 AM, Niclas Hedhman <ni...@hedhman.org>
> wrote:
>
> > Gang,
> >
> > This is just a heads-up and request for feedback.... I am very close to
> > finish the first pass at the JOOQ Entity Store, which uses JOOQ to handle
> > SQL details across SQL dialects.
> >
> > The intent is that an "enterprisey" developer would go, "Yeah, that could
> > work..." rather than the current "Use SQL as Key/Value store" approach
> that
> > we did initially.
> >
> > The structure is as follows (JOOQ generated queries);
> >
> > The TYPES table keeps mapping between Mixin types and table names. In
> > general, the Class.getSimpleName() is used as table name, but if there
> is a
> > conflict, then a "_1", "_2" and so on is added.
> >
> > create table if not exists "POLYGENE"."TYPES"(
> >     "_identity" varchar null,
> >     "_table_name" varchar null,
> >     "_created_at" timestamp null,
> >     "_modified_at" timestamp null
> > )
> >
> > The ENTITIES table is containing the meta data about the entities. This
> is
> > effectively the built-in data in EntityState. The "_value_id" field is
> the
> > foreign key into the generated mixin tables. This is done this way to
> > support the "Identity+Value" view of entities that we may have later, and
> > that history can be preserved (but not fully implemented yet), and the
> > "_deleted_at" field is there for this same reason, and also not used
> > initially.
> >
> > create table if not exists "POLYGENE"."ENTITIES"(
> >     "_identity" varchar null,
> >     "_app_version" varchar null,
> >     "_value_id" varchar null,
> >     "_version" varchar null,
> >     "_type" varchar null,
> >     "_modified_at" timestamp null,
> >     "_created_at" timestamp null,
> >     "_deleted_at" timestamp null
> > )
> >
> > Then for each Mixin type, two tables are created. The first one is to
> store
> > Properties and Associations. The second is to store the ManyAssociations
> > and NamedAssociations. I chose a single table, to reduce the amount of
> > tables that needs to be managed, but perhaps it should be one table per
> > many-to-many relation in the model.
> >
> > The primary table for the ES testcase (TestEntity) looks like;
> >
> > create table "POLYGENE"."TestEntity"(
> >     "_identity" varchar null,
> >     "_created_at" timestamp null,
> >     "instantValue" timestamp with time zone null,
> >     "bigIntegerValue" varchar null,
> >     "bigDecimalValue" varchar null,
> >     "dateTimeValue" timestamp with time zone null,
> >     "localDateTimeValue" timestamp null,
> >     "localDateValue" date null,
> >     "localTimeValue" time null,
> >     "duractionValue" varchar null,
> >     "periodValue" varchar null,
> >     "valueProperty" varchar null,
> >     "unsetName" varchar null,
> >     "emptyName" varchar null,
> >     "name" varchar null,
> >     "booleanValue" boolean null,
> >     "intValue" integer null,
> >     "longValue" bigint null,
> >     "floatValue" float null,
> >     "doubleValue" double null,
> >     "association" varchar null,
> >     "unsetAssociation" varchar null
> > )
> >
> > and the assocs table;
> >
> > create table if not exists "POLYGENE"."TestEntity_ASSOCS"(
> >     "_identity" varchar null,
> >     "_name" varchar null,
> >     "_index" varchar null,
> >     "_reference" varchar null
> > )
> >
> > The "_name" is the name of the association in the mixin, say "children"
> for
> >
> >     ManyAssociation<Person> children();
> >
> > and the "_index" is the position inside the ManyAssociation or the
> key/name
> > of NamedAssociations.
> >
> > For both of these tables, the "_identity" is the "_value_id" in the
> > ENTITIES table above.
> >
> >
> > Composite get() results in 2 queries, one for properties/assocs and one
> for
> > the _ASSOCS tables.
> >
> > select *
> >     from "POLYGENE"."ENTITIES"
> >     left outer join "POLYGENE"."TestEntity" on "_value_id" =
> > "TestEntity"."_identity"
> >     where "ENTITIES"."_identity" = 'ca22d692-5461-4756-8316-
> 5a214d055aa5'
> >
> > select *
> >     from "POLYGENE"."ENTITIES"
> >     join "POLYGENE"."TestEntity_ASSOCS" on "_value_id" =
> > "TestEntity_ASSOCS"."_identity"
> >     where "ENTITIES"."_identity" = 'ca22d692-5461-4756-8316-
> 5a214d055aa5'
> >
> > In each of those queries, there is a "join" line per Mixin type, so
> highly
> > composed
> >
> > The idea is that ValueComposites are still going to be JSON serialized
> into
> > fields and leave it in the hands of developers to decide whether to go
> with
> > more entities or use values. This serialization is not yet implemented
> and
> > the main part that is not working yet.
> > Other things still outstanding;
> >   * Primary Key constraints
> >   * Caching of TYPES, to reduce queries
> >   * Caching of queries (there is only 2 queries per primaryType)
> >   * Design improvements, not totally happy with the implementation.
> >   * History support
> >   * ALTER TABLE and Migration (big one)
> >   * Allow more JOOQ config to be specified
> >
> > Interestingly enough, the current design (without history) preserves
> state
> > of sub-types, which I think is no longer supported in other ES
> > implementations. This is just a side-effect of how the storage is
> > structured.
> >
> > So, this is on the brink of operational (6 of 8 tests in
> > AbstractEntityStoreTest passes), and before digging into the
> serialization
> > of Values, I would like to check with you guys if this sounds reasonable,
> > and if there are any ways to make this even more tempting.
> >
> > I am for instance seeking to change the "entitystore-sql" to another name
> > (for instance sqlkv) or even remove it completely, and this store take
> the
> > "entitystore-sql" name. Otherwise, I think too many people will discard
> > Polygene after seeing "SQL ES" key/value storage pattern.
> >
> > Personally, I think that this is fairly clean and something I think
> > SQL-happy people would consider an alternative to Hibernate.
> >
> >
> > I am committing/pushing this to branch "es-jooq" branch shortly, if you
> > want to take a closer look and feel free to help on implementation.
> >
> >
> > P.S. I am also keen on improving/expanding the ES test suite. It is
> > currently a bit rudimentary, and I think we should try to improve that.
> >
> >
> > Cheers
> > --
> > Niclas Hedhman, Software Developer
> > http://polygene.apache.org - New Energy for Java
> >
>



-- 
Niclas Hedhman, Software Developer
http://polygene.apache.org - New Energy for Java

Re: JOOQ EntityStore

Posted by Kent Sølvsten <ke...@gmail.com>.
Hi Niclas.

Looks good.

How do you handle the keys in named associations?

If an entry in the ASSOCS_ table is pointing at a specific entity, does
that mean, that for each relation there will be 2 entries in the table, one
for each side?

/Kent

On Sun, Jun 11, 2017 at 5:41 AM, Niclas Hedhman <ni...@hedhman.org> wrote:

> Gang,
>
> This is just a heads-up and request for feedback.... I am very close to
> finish the first pass at the JOOQ Entity Store, which uses JOOQ to handle
> SQL details across SQL dialects.
>
> The intent is that an "enterprisey" developer would go, "Yeah, that could
> work..." rather than the current "Use SQL as Key/Value store" approach that
> we did initially.
>
> The structure is as follows (JOOQ generated queries);
>
> The TYPES table keeps mapping between Mixin types and table names. In
> general, the Class.getSimpleName() is used as table name, but if there is a
> conflict, then a "_1", "_2" and so on is added.
>
> create table if not exists "POLYGENE"."TYPES"(
>     "_identity" varchar null,
>     "_table_name" varchar null,
>     "_created_at" timestamp null,
>     "_modified_at" timestamp null
> )
>
> The ENTITIES table is containing the meta data about the entities. This is
> effectively the built-in data in EntityState. The "_value_id" field is the
> foreign key into the generated mixin tables. This is done this way to
> support the "Identity+Value" view of entities that we may have later, and
> that history can be preserved (but not fully implemented yet), and the
> "_deleted_at" field is there for this same reason, and also not used
> initially.
>
> create table if not exists "POLYGENE"."ENTITIES"(
>     "_identity" varchar null,
>     "_app_version" varchar null,
>     "_value_id" varchar null,
>     "_version" varchar null,
>     "_type" varchar null,
>     "_modified_at" timestamp null,
>     "_created_at" timestamp null,
>     "_deleted_at" timestamp null
> )
>
> Then for each Mixin type, two tables are created. The first one is to store
> Properties and Associations. The second is to store the ManyAssociations
> and NamedAssociations. I chose a single table, to reduce the amount of
> tables that needs to be managed, but perhaps it should be one table per
> many-to-many relation in the model.
>
> The primary table for the ES testcase (TestEntity) looks like;
>
> create table "POLYGENE"."TestEntity"(
>     "_identity" varchar null,
>     "_created_at" timestamp null,
>     "instantValue" timestamp with time zone null,
>     "bigIntegerValue" varchar null,
>     "bigDecimalValue" varchar null,
>     "dateTimeValue" timestamp with time zone null,
>     "localDateTimeValue" timestamp null,
>     "localDateValue" date null,
>     "localTimeValue" time null,
>     "duractionValue" varchar null,
>     "periodValue" varchar null,
>     "valueProperty" varchar null,
>     "unsetName" varchar null,
>     "emptyName" varchar null,
>     "name" varchar null,
>     "booleanValue" boolean null,
>     "intValue" integer null,
>     "longValue" bigint null,
>     "floatValue" float null,
>     "doubleValue" double null,
>     "association" varchar null,
>     "unsetAssociation" varchar null
> )
>
> and the assocs table;
>
> create table if not exists "POLYGENE"."TestEntity_ASSOCS"(
>     "_identity" varchar null,
>     "_name" varchar null,
>     "_index" varchar null,
>     "_reference" varchar null
> )
>
> The "_name" is the name of the association in the mixin, say "children" for
>
>     ManyAssociation<Person> children();
>
> and the "_index" is the position inside the ManyAssociation or the key/name
> of NamedAssociations.
>
> For both of these tables, the "_identity" is the "_value_id" in the
> ENTITIES table above.
>
>
> Composite get() results in 2 queries, one for properties/assocs and one for
> the _ASSOCS tables.
>
> select *
>     from "POLYGENE"."ENTITIES"
>     left outer join "POLYGENE"."TestEntity" on "_value_id" =
> "TestEntity"."_identity"
>     where "ENTITIES"."_identity" = 'ca22d692-5461-4756-8316-5a214d055aa5'
>
> select *
>     from "POLYGENE"."ENTITIES"
>     join "POLYGENE"."TestEntity_ASSOCS" on "_value_id" =
> "TestEntity_ASSOCS"."_identity"
>     where "ENTITIES"."_identity" = 'ca22d692-5461-4756-8316-5a214d055aa5'
>
> In each of those queries, there is a "join" line per Mixin type, so highly
> composed
>
> The idea is that ValueComposites are still going to be JSON serialized into
> fields and leave it in the hands of developers to decide whether to go with
> more entities or use values. This serialization is not yet implemented and
> the main part that is not working yet.
> Other things still outstanding;
>   * Primary Key constraints
>   * Caching of TYPES, to reduce queries
>   * Caching of queries (there is only 2 queries per primaryType)
>   * Design improvements, not totally happy with the implementation.
>   * History support
>   * ALTER TABLE and Migration (big one)
>   * Allow more JOOQ config to be specified
>
> Interestingly enough, the current design (without history) preserves state
> of sub-types, which I think is no longer supported in other ES
> implementations. This is just a side-effect of how the storage is
> structured.
>
> So, this is on the brink of operational (6 of 8 tests in
> AbstractEntityStoreTest passes), and before digging into the serialization
> of Values, I would like to check with you guys if this sounds reasonable,
> and if there are any ways to make this even more tempting.
>
> I am for instance seeking to change the "entitystore-sql" to another name
> (for instance sqlkv) or even remove it completely, and this store take the
> "entitystore-sql" name. Otherwise, I think too many people will discard
> Polygene after seeing "SQL ES" key/value storage pattern.
>
> Personally, I think that this is fairly clean and something I think
> SQL-happy people would consider an alternative to Hibernate.
>
>
> I am committing/pushing this to branch "es-jooq" branch shortly, if you
> want to take a closer look and feel free to help on implementation.
>
>
> P.S. I am also keen on improving/expanding the ES test suite. It is
> currently a bit rudimentary, and I think we should try to improve that.
>
>
> Cheers
> --
> Niclas Hedhman, Software Developer
> http://polygene.apache.org - New Energy for Java
>

Re: JOOQ EntityStore

Posted by Tibor Mlynarik <ti...@gmail.com>.
Hi Niclas,

your correction put me in doubt if I understand db model correctly. 

What happens to ENTITIES._value_id and related TestEntity*._identity when entity updates ?
It has to be updated to new value across all 3 tables, right ? ( and such value has to be globally unique)

	Tibor



> On Jun 11, 2017, at 7:24 AM, Niclas Hedhman <ni...@hedhman.org> wrote:
> 
> On Sun, Jun 11, 2017 at 11:41 AM, Niclas Hedhman <ni...@hedhman.org> wrote:
>> 
>> Composite get() results in 2 queries, one for properties/assocs and one
> for the _ASSOCS tables.
>> 
>> select *
>>    from "POLYGENE"."ENTITIES"
>>    left outer join "POLYGENE"."TestEntity" on "_value_id" =
> "TestEntity"."_identity"
>>    where "ENTITIES"."_identity" = 'ca22d692-5461-4756-8316-5a214d055aa5'
>> 
>> select *
>>    from "POLYGENE"."ENTITIES"
>>    join "POLYGENE"."TestEntity_ASSOCS" on "_value_id" =
> "TestEntity_ASSOCS"."_identity"
>>    where "ENTITIES"."_identity" = 'ca22d692-5461-4756-8316-5a214d055aa5'
>> 
>> In each of those queries, there is a "join" line per Mixin type, so
> highly composed
> 
> CORRECTION; There is an additional query for the ENTITIES table itself, to
> obtain the "_value_id". This might also be more cached than currently is,
> as it is fetched quite often.
> 
> 
> 
> Cheers
> --
> Niclas Hedhman, Software Developer
> http://polygene.apache.org - New Energy for Java


Re: JOOQ EntityStore

Posted by Niclas Hedhman <ni...@hedhman.org>.
On Sun, Jun 11, 2017 at 11:41 AM, Niclas Hedhman <ni...@hedhman.org> wrote:
>
> Composite get() results in 2 queries, one for properties/assocs and one
for the _ASSOCS tables.
>
> select *
>     from "POLYGENE"."ENTITIES"
>     left outer join "POLYGENE"."TestEntity" on "_value_id" =
"TestEntity"."_identity"
>     where "ENTITIES"."_identity" = 'ca22d692-5461-4756-8316-5a214d055aa5'
>
> select *
>     from "POLYGENE"."ENTITIES"
>     join "POLYGENE"."TestEntity_ASSOCS" on "_value_id" =
"TestEntity_ASSOCS"."_identity"
>     where "ENTITIES"."_identity" = 'ca22d692-5461-4756-8316-5a214d055aa5'
>
> In each of those queries, there is a "join" line per Mixin type, so
highly composed

CORRECTION; There is an additional query for the ENTITIES table itself, to
obtain the "_value_id". This might also be more cached than currently is,
as it is fetched quite often.



Cheers
--
Niclas Hedhman, Software Developer
http://polygene.apache.org - New Energy for Java

Re: JOOQ EntityStore

Posted by Paul Merlin <pa...@apache.org>.
Hi Niclas,

Yep, looks good.


> Interestingly enough, the current design (without history) preserves state
> of sub-types, which I think is no longer supported in other ES
> implementations. This is just a side-effect of how the storage is
> structured.

IIRC, all JSONMapEntityStore support this, JSONEntityState gets the full
state and changes are surgically applied. PreferencesES should too.
Don't know about Cassandra.

> [snip]
>
> P.S. I am also keen on improving/expanding the ES test suite. It is
> currently a bit rudimentary, and I think we should try to improve that.

The above is worth adding to the test suite. Most of our EntityStores
preserve polymorphic state. I'd say that EntityStores that don't support
this should explicitly opt out of it by ignoring the corresponding test.