You are viewing a plain text version of this content. The canonical link for it is here.
Posted to torque-dev@db.apache.org by "Henning P. Schmiedehausen" <hp...@intermeta.de> on 2004/08/24 17:12:03 UTC
PostgreSQL ID generation
Hi,
currently, PostgreSQL uses a kind of "autoincrement in the database"
for "native" id generation.
It uses a sequence, created e.g. with
create sequence foo_bar_seq;
and then teaches the tables to do autoincrementing a la MySQL:
create table foo (
bar integer default nextval('foo_bar_seq')
primary key (bar)
);
While this works well, it is not compatible to e.g. Hibernate which
wants to do its own primary key management with a sequence.
It also makes it impossible to build "better" (e.g. a hi/lo) sequence
generator because the getIDMethodSQL method from the adapter returns
"select currval('<sequence name>')".
I'd propose to change the PostgreSQL adapter to use the native
sequence model. It does the following things:
- Changes the native type of ID generation for PostgreSQL from
AUTO_INCREMENT to SEQUENCE.
- Makes the DBPostgres adapter return "select nextval('<sequence name>')
from getIDMethodSQL
- Removes the DEFAULT nextval(..) statement from the ID columns in table
generation
- Allows AUTO_INCREMENT columns in all tables, regardless whether the
ID mode is native or not. A "serial" column is really an auto incrementing
column, no matter how the id elements are generated.
Please note that the current sql generation makes no sense. If we create
or SQL with the DEFAULT nextval('sequence'), why restrict serials to tables
that use native Ids? They don't use this anyway.
This is the natural model of ID generation with PostgreSQL and is
compatible to e.g. Hibernate (you can use a Hibernate Application and
a Torque Application without any changes in the database schema.
I'd really like to see this patch go in.
On a related note: I don't understand why the sequence creation has
been dropped from the schema generation? They could be generated in a
second file, but as it currently is, the regular sequence of "write a
schema, torque:om, torque:sql, torque:create-db, torque:insert-sql,
run your application" no longer works because the sequences are
missing.
I'd very much love to see this revision:
http://cvs.apache.org/viewcvs.cgi/db-torque/src/generator/src/templates/sql/base/postgresql/table.vm?r1=1.1&r2=1.1.2.1&only_with_tag=TORQUE_3_1_BRANCH
reverted.
Regards
Henning
Index: src/java/org/apache/torque/adapter/DBPostgres.java
===================================================================
--- src/java/org/apache/torque/adapter/DBPostgres.java (revision 2575)
+++ src/java/org/apache/torque/adapter/DBPostgres.java (working copy)
@@ -65,7 +65,7 @@
*/
public String getIDMethodType()
{
- return AUTO_INCREMENT;
+ return SEQUENCE;
}
/**
@@ -76,7 +76,7 @@
*/
public String getIDMethodSQL(Object name)
{
- return ("select currval('" + name + "')");
+ return ("select nextval('" + name + "')");
}
/**
Index: src/generator/src/templates/sql/base/postgresql/columns.vm
===================================================================
--- src/generator/src/templates/sql/base/postgresql/columns.vm (revision 2575)
+++ src/generator/src/templates/sql/base/postgresql/columns.vm (working copy)
@@ -9,14 +9,10 @@
#else
#set ( $size = "" )
#end
- #if (($table.IdMethod == "native") && ($col.isPrimaryKey()))
- #set ( $default = "DEFAULT nextval('$table.SequenceName')" )
- #else
- #set ( $default = $col.DefaultSetting )
- #end
+ #set ( $default = $col.DefaultSetting )
#set ( $nullString = $strings.select($col.isNotNull(), $dbprops.get("NOTNULL"),"") )
#set ( $entry = "$col.Name $type $size $default $nullString" )
- #if ($col.isAutoIncrement() && $table.IdMethod == "native")
+ #if ($col.isAutoIncrement())
#set ( $autoIncrement = $dbprops.get("AUTOINCREMENT") )
#else
#set ( $autoIncrement = " " )
--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen INTERMETA GmbH
hps@intermeta.de +49 9131 50 654 0 http://www.intermeta.de/
RedHat Certified Engineer -- Jakarta Turbine Development -- hero for hire
Linux, Java, perl, Solaris -- Consulting, Training, Development
"Fighting for one's political stand is an honorable action, but re-
fusing to acknowledge that there might be weaknesses in one's
position - in order to identify them so that they can be remedied -
is a large enough problem with the Open Source movement that it
deserves to be on this list of the top five problems."
-- Michelle Levesque, "Fundamental Issues with
Open Source Software Development"
---------------------------------------------------------------------
To unsubscribe, e-mail: torque-dev-unsubscribe@db.apache.org
For additional commands, e-mail: torque-dev-help@db.apache.org
Re: PostgreSQL ID generation
Posted by "Henning P. Schmiedehausen" <hp...@intermeta.de>.
Scott Eade <se...@backstagetech.com.au> writes:
> 2. PostgreSQL has a serial data type that Torque already supports to
Yep.
> a limited extent. IMHO we should fully embrace this by:
> 1. Generating schema sql that used the SERIAL datatype rather
> than "default nextval('foo_bar_seq')" (the latter with a
> different sequence name structure is what PostgreSQL
> maintains automatically when the SERIAL datatype is used).
> 2. Leaving sequence naming (with the available override),
> sequence creation and destruction (drop) to PostgreSQL.
> 3. Enhancing the JDBC target to detect the use of sequences so
> as to produce schema files that use SERIAL.
> 4. Determine whether or not the id-method-parameter element has
> any use.
The problem with the torque schema and postgresql is, that there are many
different ways to generate an ID here:
- Using ID_TABLE: using a regular number type like INTEGER or BIGINT
as index, generating the ID with Torque
- Using a serial datatype: works like MySQL AUTO_INCREMENT, so the ID
generator must fetch the index column after an insert
- Using default nextval('foo_bar_seq'): Uses a Sequence but manages
the id insert process inside the database engine, the ID generator
must fetch the index column after an insert. From a database point of
view, this is the same as "auto_increment".
- Use a sequence and let the ID generator fetch the next value from the
sequence and then insert it into the table (sequence generator)
The last one is compatible to the Hibernate "sequence" scheme.
> 3. The column elements autoIncrement attribute in the schema is not
> clearly defined. Correct me if I am wrong, but from your comments
> and your patch it appears that you interpret a combination of
> idMethod="native" for the table and autoIncrement="true" on a
> column to be the triggers for the use of a sequence. I actually
The Sequence Generator which does "select nextval('sequence_name')"
pulls the ID from the sequence and then inserts it together with the
data just like the ID_TABLE generator does. So the ID column itself is
just a regular number type without any special properties (like serial or
default value).
> thought that primaryKey="true" had something to do with it and
> from the generated code this does in fact appear to be the case.
> Generate the sql for the following schema (comments on the resultant
> sql are included within):
> <!-- idbroker with autoIncrement=false on PK -->
> <!-- Resultant sql seems okay, not sure how IDBroker would behave. -->
> <table name="dog_idbroker" idMethod="idbroker">
> <column name="dog_id" required="true" autoIncrement="false"
> primaryKey="true" type="INTEGER"/>
> <column name="animal_id" required="true" autoIncrement="false"
> primaryKey="false" type="INTEGER"/>
> <column name="name" size="100" type="VARCHAR"/>
> </table>
Yes.
> <!-- native with autoIncrement=false on PK -->
> <!-- Resultant sql includes:
> dog_id integer DEFAULT nextval('dog_native_SEQ') NOT NULL
> Should be:
> dog_id integer NOT NULL
Yes.
> -->
> <table name="dog_native" idMethod="native">
> <column name="dog_id" required="true" autoIncrement="false"
> primaryKey="true" type="INTEGER"/>
> <column name="animal_id" required="true" autoIncrement="false"
> primaryKey="false" type="INTEGER"/>
> <column name="name" size="100" type="VARCHAR"/>
> <!--id-method-parameter name="seqName" value="DOG_NATIVE_DOG_ID_SEQ"/-->
> </table>
> <!-- idbroker with autoIncrement=true on PK -->
> <!-- Resultant sql includes:
> cat_id
> Should be:
> cat_id integer NOT NULL
If you want auto_increment, then this should probably be some sort of
serial type. This will conflict with the ID_BROKER.
[...]
>We could update Torque to consistently use the SERIAL datatype, and to
>only so this when autoIncrement is true. In addition to this it would
That is a good idea. However, we must check first, whether SERIAL is
allowed for all number column types (don't know, never needed SERIAL
before).
>be great if we could provide a means of specifying the sequence name for
>the particular column it is use on, perhaps seqName should be a column
>attribute (and perhaps an option to determine whether or not the
>sequence should be created and dropped would help out people with
>existing sequences).
>Basically as it stands it is a right mess and needs to be addressed
>before the next RC.
The problem is, that the different ID generators (auto_increment,
sequence, id_table) have different semantics on when to fetch the ID
values. So for postgresql, these must generate different column types.
idbroker, sequence ----> regular number type NOT NULL (e.g. BIGINT NOT NULL)
auto_increment ----> regular number type SERIAL
(SERIAL is translated internally to
CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
colname integer DEFAULT nextval('tablename_colname_seq') NOT NULL
);
according to http://www.postgresql.org/docs/7.3/static/datatype.html#DATATYPE-SERIAL)
when using "sequence", we should also create the sequence in the SQL.
What is left is to find out what "native" should really be. IMHO it
should be equivalent "sequence" because "auto_increment" creates
implicit object in the database outside user control. This is the patch
that I've put on the list.
There is also a note with "SERIAL":
--- cut --- http://www.postgresql.org/docs/7.3/static/datatype.html#DATATYPE-SERIAL --- cut ---
Note: Prior to PostgreSQL 7.3, serial implied UNIQUE. This is no
longer automatic. If you wish a serial column to be UNIQUE or a
PRIMARY KEY it must now be specified, just as with any other data
type.
--- cut --- http://www.postgresql.org/docs/7.3/static/datatype.html#DATATYPE-SERIAL --- cut ---
so we might need to build a switch for pre-7.3 PostgreSQL databases.
Regards
Henning
--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen INTERMETA GmbH
hps@intermeta.de +49 9131 50 654 0 http://www.intermeta.de/
RedHat Certified Engineer -- Jakarta Turbine Development -- hero for hire
Linux, Java, perl, Solaris -- Consulting, Training, Development
"Fighting for one's political stand is an honorable action, but re-
fusing to acknowledge that there might be weaknesses in one's
position - in order to identify them so that they can be remedied -
is a large enough problem with the Open Source movement that it
deserves to be on this list of the top five problems."
-- Michelle Levesque, "Fundamental Issues with
Open Source Software Development"
---------------------------------------------------------------------
To unsubscribe, e-mail: torque-dev-unsubscribe@db.apache.org
For additional commands, e-mail: torque-dev-help@db.apache.org
Re: PostgreSQL ID generation
Posted by Scott Eade <se...@backstagetech.com.au>.
[Delayed response - several reasons...]
Henning P. Schmiedehausen wrote:
>... must think some more.
>
A few things strike me:
1. The schema reference (included in the Generator section of the
Torque site) is by no means complete and is scant on detail given
how important it is in the scheme of things (no pun intended).
This document should provide a decent specification of the effect
the various elements and attributes will have on the generated sql
and OM code rather than being a partial reference used to assist
with a trial and error decoding process.
2. PostgreSQL has a serial data type that Torque already supports to
a limited extent. IMHO we should fully embrace this by:
1. Generating schema sql that used the SERIAL datatype rather
than "default nextval('foo_bar_seq')" (the latter with a
different sequence name structure is what PostgreSQL
maintains automatically when the SERIAL datatype is used).
2. Leaving sequence naming (with the available override),
sequence creation and destruction (drop) to PostgreSQL.
3. Enhancing the JDBC target to detect the use of sequences so
as to produce schema files that use SERIAL.
4. Determine whether or not the id-method-parameter element has
any use.
3. The column elements autoIncrement attribute in the schema is not
clearly defined. Correct me if I am wrong, but from your comments
and your patch it appears that you interpret a combination of
idMethod="native" for the table and autoIncrement="true" on a
column to be the triggers for the use of a sequence. I actually
thought that primaryKey="true" had something to do with it and
from the generated code this does in fact appear to be the case.
Generate the sql for the following schema (comments on the resultant
sql are included within):
<!-- idbroker with autoIncrement=false on PK -->
<!-- Resultant sql seems okay, not sure how IDBroker would behave. -->
<table name="dog_idbroker" idMethod="idbroker">
<column name="dog_id" required="true" autoIncrement="false"
primaryKey="true" type="INTEGER"/>
<column name="animal_id" required="true" autoIncrement="false"
primaryKey="false" type="INTEGER"/>
<column name="name" size="100" type="VARCHAR"/>
</table>
<!-- native with autoIncrement=false on PK -->
<!-- Resultant sql includes:
dog_id integer DEFAULT nextval('dog_native_SEQ') NOT NULL
Should be:
dog_id integer NOT NULL
-->
<table name="dog_native" idMethod="native">
<column name="dog_id" required="true" autoIncrement="false"
primaryKey="true" type="INTEGER"/>
<column name="animal_id" required="true" autoIncrement="false"
primaryKey="false" type="INTEGER"/>
<column name="name" size="100" type="VARCHAR"/>
<!--id-method-parameter name="seqName" value="DOG_NATIVE_DOG_ID_SEQ"/-->
</table>
<!-- idbroker with autoIncrement=true on PK -->
<!-- Resultant sql includes:
cat_id
Should be:
cat_id integer NOT NULL
-->
<table name="cat_idbroker" idMethod="idbroker">
<column name="cat_id" required="true" autoIncrement="true"
primaryKey="true" type="INTEGER"/>
<column name="animal_id" required="true" autoIncrement="false"
primaryKey="false" type="INTEGER"/>
<column name="name" size="100" type="VARCHAR"/>
</table>
<!-- native with autoIncrement=true on PK -->
<!-- Resultant sql includes (IMO correct):
cat_id serial
-->
<table name="cat_native" idMethod="native">
<column name="cat_id" required="true" autoIncrement="true"
primaryKey="true" type="INTEGER"/>
<column name="animal_id" required="true" autoIncrement="false"
primaryKey="false" type="INTEGER"/>
<column name="name" size="100" type="VARCHAR"/>
<!--id-method-parameter name="seqName" value="CAT_NATIVE_CAT_ID_SEQ"/-->
</table>
<!-- idbroker with autoIncrement=true on non-PK -->
<!-- Resultant sql includes:
mouse_id integer NOT NULL,
animal_id
Should be:
mouse_id integer NOT NULL,
animal_id integer NOT NULL
-->
<table name="mouse_idbroker" idMethod="idbroker">
<column name="mouse_id" required="true" autoIncrement="false"
primaryKey="true" type="INTEGER"/>
<column name="animal_id" required="true" autoIncrement="true"
primaryKey="false" type="INTEGER"/>
<column name="name" size="100" type="VARCHAR"/>
</table>
<!-- native with autoIncrement=false on non-PK -->
<!-- Resultant sql includes:
mouse_id integer DEFAULT nextval('mouse_native_SEQ') NOT NULL,
animal_id serial
Should be:
mouse_id integer NOT NULL,
animal_id serial
-->
<table name="mouse_native" idMethod="native">
<column name="mouse_id" required="true" autoIncrement="false"
primaryKey="true" type="INTEGER"/>
<column name="animal_id" required="true" autoIncrement="true"
primaryKey="false" type="INTEGER"/>
<column name="name" size="100" type="VARCHAR"/>
<!--id-method-parameter name="seqName"
value="MOUSE_NATIVE_MOUSE_ID_SEQ"/-->
</table>
We could update Torque to consistently use the SERIAL datatype, and to
only so this when autoIncrement is true. In addition to this it would
be great if we could provide a means of specifying the sequence name for
the particular column it is use on, perhaps seqName should be a column
attribute (and perhaps an option to determine whether or not the
sequence should be created and dropped would help out people with
existing sequences).
Basically as it stands it is a right mess and needs to be addressed
before the next RC.
Scott
--
Scott Eade
Backstage Technologies Pty. Ltd.
http://www.backstagetech.com.au
---------------------------------------------------------------------
To unsubscribe, e-mail: torque-dev-unsubscribe@db.apache.org
For additional commands, e-mail: torque-dev-help@db.apache.org
Re: PostgreSQL ID generation
Posted by "Henning P. Schmiedehausen" <hp...@intermeta.de>.
Scott Eade <se...@backstagetech.com.au> writes:
>Henning P. Schmiedehausen wrote:
[... my message hung in moderation for a few days. ...]
>What am I missing Henning? If my schema is:
> <table name="category" idMethod="native">
> <column name="category_id" required="true" autoIncrement="true"
> primaryKey="true" type="INTEGER"/>
> <column name="name" size="100" type="VARCHAR"/>
> <id-method-parameter name="seqName" value="CATEGORY_CATEGORY_ID_SEQ"/>
> </table>
>The generated sql is:
>-----------------------------------------------------------------------------
>-- category
>-----------------------------------------------------------------------------
>DROP TABLE category CASCADE;
>CREATE TABLE category
>(
> category_id serial,
> name varchar (100),
> PRIMARY KEY (category_id)
>);
>When this is executed PostgreSQL automatically creates a sequence (from
>pgAdmin III):
>CREATE SEQUENCE public.category_category_id_seq
> INCREMENT 1
> MINVALUE 1
> MAXVALUE 9223372036854775807
> START 1
> CACHE 1;
With the patch mentioned above applied, the id column will be a
regular integer column without any serial modifiers or relation to a
given sequence. The ID Generator will use "select nextval('foo_seq')"
and foo_seq is defined nowhere.
I don't like "autoIncrement = true" for the ID Column because the ID
management should (IMHO) done by the ID generator. Personally, I don't
like MySQL AUTO_INCREMENT columns very much. These are handy for a 20
line script where you don't want to do the ID generation all by
yourself, but if we have an O/R layer which contains ID Generators,
these IDs should be either generated with the ID Table using a Hi/Lo
scheme or with a sequence and select nextval.
Just reverting the mentioned revision won't work either, because if
you create a database schema just like the one that I use (where a lot
of tables use the same sequence for ID generation, ensuring that an ID
is unique in the whole database), then the resulting sql will contain
a lot of
create sequence foo_seq;
create sequence foo_seq;
statements which leads to (probably non-fatal but annoying) errors
when inserting this SQL.
... must think some more.
Regards
Henning
--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen INTERMETA GmbH
hps@intermeta.de +49 9131 50 654 0 http://www.intermeta.de/
RedHat Certified Engineer -- Jakarta Turbine Development -- hero for hire
Linux, Java, perl, Solaris -- Consulting, Training, Development
"Fighting for one's political stand is an honorable action, but re-
fusing to acknowledge that there might be weaknesses in one's
position - in order to identify them so that they can be remedied -
is a large enough problem with the Open Source movement that it
deserves to be on this list of the top five problems."
-- Michelle Levesque, "Fundamental Issues with
Open Source Software Development"
---------------------------------------------------------------------
To unsubscribe, e-mail: torque-dev-unsubscribe@db.apache.org
For additional commands, e-mail: torque-dev-help@db.apache.org
Re: PostgreSQL ID generation
Posted by Scott Eade <se...@backstagetech.com.au>.
Henning P. Schmiedehausen wrote:
>Hi,
>
>currently, PostgreSQL uses a kind of "autoincrement in the database"
>for "native" id generation.
>
>It uses a sequence, created e.g. with
>
>create sequence foo_bar_seq;
>
>and then teaches the tables to do autoincrementing a la MySQL:
>
>create table foo (
> bar integer default nextval('foo_bar_seq')
> primary key (bar)
>);
>
>While this works well, it is not compatible to e.g. Hibernate which
>wants to do its own primary key management with a sequence.
>
>It also makes it impossible to build "better" (e.g. a hi/lo) sequence
>generator because the getIDMethodSQL method from the adapter returns
>"select currval('<sequence name>')".
>
>I'd propose to change the PostgreSQL adapter to use the native
>sequence model. It does the following things:
>
>- Changes the native type of ID generation for PostgreSQL from
> AUTO_INCREMENT to SEQUENCE.
>
>- Makes the DBPostgres adapter return "select nextval('<sequence name>')
> from getIDMethodSQL
>
>- Removes the DEFAULT nextval(..) statement from the ID columns in table
> generation
>
>- Allows AUTO_INCREMENT columns in all tables, regardless whether the
> ID mode is native or not. A "serial" column is really an auto incrementing
> column, no matter how the id elements are generated.
>
> Please note that the current sql generation makes no sense. If we create
> or SQL with the DEFAULT nextval('sequence'), why restrict serials to tables
> that use native Ids? They don't use this anyway.
>
>
>This is the natural model of ID generation with PostgreSQL and is
>compatible to e.g. Hibernate (you can use a Hibernate Application and
>a Torque Application without any changes in the database schema.
>
>I'd really like to see this patch go in.
>
>
I don't have a problem with your patch, it works fine and it removes the
appearance of a problem highlighted in the faq:
http://wiki.apache.org/db-torque/PostgreSQLFAQ#head-e2be9dec72b7c31f16be1b4e70364e8190684691
>On a related note: I don't understand why the sequence creation has
>been dropped from the schema generation? They could be generated in a
>second file, but as it currently is, the regular sequence of "write a
>schema, torque:om, torque:sql, torque:create-db, torque:insert-sql,
>run your application" no longer works because the sequences are
>missing.
>
>I'd very much love to see this revision:
>http://cvs.apache.org/viewcvs.cgi/db-torque/src/generator/src/templates/sql/base/postgresql/table.vm?r1=1.1&r2=1.1.2.1&only_with_tag=TORQUE_3_1_BRANCH
>reverted.
>
>
What am I missing Henning? If my schema is:
<table name="category" idMethod="native">
<column name="category_id" required="true" autoIncrement="true"
primaryKey="true" type="INTEGER"/>
<column name="name" size="100" type="VARCHAR"/>
<id-method-parameter name="seqName" value="CATEGORY_CATEGORY_ID_SEQ"/>
</table>
The generated sql is:
-----------------------------------------------------------------------------
-- category
-----------------------------------------------------------------------------
DROP TABLE category CASCADE;
CREATE TABLE category
(
category_id serial,
name varchar (100),
PRIMARY KEY (category_id)
);
When this is executed PostgreSQL automatically creates a sequence (from
pgAdmin III):
CREATE SEQUENCE public.category_category_id_seq
INCREMENT 1
MINVALUE 1
MAXVALUE 9223372036854775807
START 1
CACHE 1;
The above sequence is automatically dropped when the table is dropped.
Woo hoo, everything works!
The trick is the following line in the schema:
<id-method-parameter name="seqName" value="CATEGORY_CATEGORY_ID_SEQ"/>
this is glossing over the fact that Table.getSequenceName() by default
(if no seqName is provided in the schema) returns just ${TABLE_NAME}_SEQ
(in my example CATEGORY_SEQ) rather than ${TABLE_NAME}_${COLUMN_NAME}_SEQ.
In my schemas I have always been specifying seqName for my PostgreSQL
sequences to get Torque to use the sequences that are automatically
created. The patch you refer to above was stopping Torque from
generating the code to manage the sequences because PostgreSQL is doing
this automatically - is it not doing this for you? Up until now I have
always had to remove the Torque generated sequence handling code because
it only caused problems.
My preference is to have the MapBuilder classes generated with sequence
names that match those used by the database, but to still allow this to
be overridden by seqName. As things currently stand this is not set up
to handle different values for different database systems and I am not
sure if there are any unwritten rules to make this easy to apply (e.g.
can only one primary key column be configured to be autoIncrement="true"?)
Or am I just totally missing something obvious?
Scott
--
Scott Eade
Backstage Technologies Pty. Ltd.
http://www.backstagetech.com.au
---------------------------------------------------------------------
To unsubscribe, e-mail: torque-dev-unsubscribe@db.apache.org
For additional commands, e-mail: torque-dev-help@db.apache.org