You are viewing a plain text version of this content. The canonical link for it is here.

Posted to torque-dev@db.apache.org by Scott Eade <se...@backstagetech.com.au> on 2004/09/12 16:23:28 UTC

Re: PostgreSQL ID generation

[Delayed response - several reasons...]

Henning P. Schmiedehausen wrote:

>... must think some more.
>
A few things strike me:

   1. The schema reference (included in the Generator section of the
      Torque site) is by no means complete and is scant on detail given
      how important it is in the scheme of things (no pun intended). 
      This document should provide a decent specification of the effect
      the various elements and attributes will have on the generated sql
      and OM code rather than being a partial reference used to assist
      with a trial and error decoding process.
   2. PostgreSQL has a serial data type that Torque already supports to
      a limited extent.  IMHO we should fully embrace this by:
         1. Generating schema sql that used the SERIAL datatype rather
            than "default nextval('foo_bar_seq')" (the latter with a
            different sequence name structure is what PostgreSQL
            maintains automatically when the SERIAL datatype is used).
         2. Leaving sequence naming (with the available override),
            sequence creation and destruction (drop) to PostgreSQL.
         3. Enhancing the JDBC target to detect the use of sequences so
            as to produce schema files that use SERIAL.
         4. Determine whether or not the id-method-parameter element has
            any use.
   3. The column elements autoIncrement attribute in the schema is not
      clearly defined.  Correct me if I am wrong, but from your comments
      and your patch it appears that you interpret a combination of
      idMethod="native" for the table and autoIncrement="true" on a
      column to be the triggers for the use of a sequence.  I actually
      thought that primaryKey="true" had something to do with it and
      from the generated code this does in fact appear to be the case.

    Generate the sql for the following schema (comments on the resultant
    sql are included within):


  <!-- idbroker with autoIncrement=false on PK -->
  <!-- Resultant sql seems okay, not sure how IDBroker would behave. -->
  <table name="dog_idbroker" idMethod="idbroker">
    <column name="dog_id" required="true" autoIncrement="false"
        primaryKey="true" type="INTEGER"/>
    <column name="animal_id" required="true" autoIncrement="false"
        primaryKey="false" type="INTEGER"/>
    <column name="name" size="100" type="VARCHAR"/>
  </table>

  <!-- native with autoIncrement=false on PK -->
  <!-- Resultant sql includes:
            dog_id integer DEFAULT nextval('dog_native_SEQ') NOT NULL
       Should be:
            dog_id integer NOT NULL
   -->
  <table name="dog_native" idMethod="native">
    <column name="dog_id" required="true" autoIncrement="false"
        primaryKey="true" type="INTEGER"/>
    <column name="animal_id" required="true" autoIncrement="false"
        primaryKey="false" type="INTEGER"/>
    <column name="name" size="100" type="VARCHAR"/>
    <!--id-method-parameter name="seqName" value="DOG_NATIVE_DOG_ID_SEQ"/-->
  </table>

  <!-- idbroker with autoIncrement=true on PK -->
  <!-- Resultant sql includes:
            cat_id
       Should be:
            cat_id integer NOT NULL
   -->
  <table name="cat_idbroker" idMethod="idbroker">
    <column name="cat_id" required="true" autoIncrement="true"
        primaryKey="true" type="INTEGER"/>
    <column name="animal_id" required="true" autoIncrement="false"
        primaryKey="false" type="INTEGER"/>
    <column name="name" size="100" type="VARCHAR"/>
  </table>

  <!-- native with autoIncrement=true on PK -->
  <!-- Resultant sql includes (IMO correct):
            cat_id serial
   -->
  <table name="cat_native" idMethod="native">
    <column name="cat_id" required="true" autoIncrement="true"
        primaryKey="true" type="INTEGER"/>
    <column name="animal_id" required="true" autoIncrement="false"
        primaryKey="false" type="INTEGER"/>
    <column name="name" size="100" type="VARCHAR"/>
    <!--id-method-parameter name="seqName" value="CAT_NATIVE_CAT_ID_SEQ"/-->
  </table>

  <!-- idbroker with autoIncrement=true on non-PK -->
  <!-- Resultant sql includes:
            mouse_id integer NOT NULL,
            animal_id
       Should be:
             mouse_id integer NOT NULL,
            animal_id integer NOT NULL
  -->
  <table name="mouse_idbroker" idMethod="idbroker">
    <column name="mouse_id" required="true" autoIncrement="false"
        primaryKey="true" type="INTEGER"/>
    <column name="animal_id" required="true" autoIncrement="true"
        primaryKey="false" type="INTEGER"/>
    <column name="name" size="100" type="VARCHAR"/>
  </table>

  <!-- native with autoIncrement=false on non-PK -->
  <!-- Resultant sql includes:
            mouse_id integer DEFAULT nextval('mouse_native_SEQ') NOT NULL,
            animal_id serial
       Should be:
            mouse_id integer NOT NULL,
            animal_id serial
   -->
  <table name="mouse_native" idMethod="native">
    <column name="mouse_id" required="true" autoIncrement="false"
        primaryKey="true" type="INTEGER"/>
    <column name="animal_id" required="true" autoIncrement="true"
        primaryKey="false" type="INTEGER"/>
    <column name="name" size="100" type="VARCHAR"/>
    <!--id-method-parameter name="seqName" 
value="MOUSE_NATIVE_MOUSE_ID_SEQ"/-->
  </table>

We could update Torque to consistently use the SERIAL datatype, and to 
only so this when autoIncrement is true.  In addition to this it would 
be great if we could provide a means of specifying the sequence name for 
the particular column it is use on, perhaps seqName should be a column 
attribute (and perhaps an option to determine whether or not the 
sequence should be created and dropped would help out people with 
existing sequences).

Basically as it stands it is a right mess and needs to be addressed 
before the next RC.

Scott

-- 
Scott Eade
Backstage Technologies Pty. Ltd.
http://www.backstagetech.com.au


---------------------------------------------------------------------
To unsubscribe, e-mail: torque-dev-unsubscribe@db.apache.org
For additional commands, e-mail: torque-dev-help@db.apache.org

Re: PostgreSQL ID generation

Posted by "Henning P. Schmiedehausen" <hp...@intermeta.de>.

Scott Eade <se...@backstagetech.com.au> writes:

>   2. PostgreSQL has a serial data type that Torque already supports to

Yep.

>      a limited extent.  IMHO we should fully embrace this by:
>         1. Generating schema sql that used the SERIAL datatype rather
>            than "default nextval('foo_bar_seq')" (the latter with a
>            different sequence name structure is what PostgreSQL
>            maintains automatically when the SERIAL datatype is used).
>         2. Leaving sequence naming (with the available override),
>            sequence creation and destruction (drop) to PostgreSQL.
>         3. Enhancing the JDBC target to detect the use of sequences so
>            as to produce schema files that use SERIAL.
>         4. Determine whether or not the id-method-parameter element has
>            any use.

The problem with the torque schema and postgresql is, that there are many
different ways to generate an ID here:

- Using ID_TABLE: using a regular number type like INTEGER or BIGINT
  as index, generating the ID with Torque

- Using a serial datatype: works like MySQL AUTO_INCREMENT, so the ID
  generator must fetch the index column after an insert

- Using default nextval('foo_bar_seq'): Uses a Sequence but manages
  the id insert process inside the database engine, the ID generator
  must fetch the index column after an insert. From a database point of
  view, this is the same as "auto_increment".

- Use a sequence and let the ID generator fetch the next value from the
  sequence and then insert it into the table (sequence generator)

The last one is compatible to the Hibernate "sequence" scheme.
                                     
>   3. The column elements autoIncrement attribute in the schema is not
>      clearly defined.  Correct me if I am wrong, but from your comments
>      and your patch it appears that you interpret a combination of
>      idMethod="native" for the table and autoIncrement="true" on a
>      column to be the triggers for the use of a sequence.  I actually

The Sequence Generator which does "select nextval('sequence_name')"
pulls the ID from the sequence and then inserts it together with the
data just like the ID_TABLE generator does. So the ID column itself is
just a regular number type without any special properties (like serial or
default value).

>      thought that primaryKey="true" had something to do with it and
>      from the generated code this does in fact appear to be the case.

>    Generate the sql for the following schema (comments on the resultant
>    sql are included within):


>  <!-- idbroker with autoIncrement=false on PK -->
>  <!-- Resultant sql seems okay, not sure how IDBroker would behave. -->
>  <table name="dog_idbroker" idMethod="idbroker">
>    <column name="dog_id" required="true" autoIncrement="false"
>        primaryKey="true" type="INTEGER"/>
>    <column name="animal_id" required="true" autoIncrement="false"
>        primaryKey="false" type="INTEGER"/>
>    <column name="name" size="100" type="VARCHAR"/>
>  </table>

Yes.

>  <!-- native with autoIncrement=false on PK -->
>  <!-- Resultant sql includes:
>            dog_id integer DEFAULT nextval('dog_native_SEQ') NOT NULL
>       Should be:
>            dog_id integer NOT NULL

Yes.

>   -->
>  <table name="dog_native" idMethod="native">
>    <column name="dog_id" required="true" autoIncrement="false"
>        primaryKey="true" type="INTEGER"/>
>    <column name="animal_id" required="true" autoIncrement="false"
>        primaryKey="false" type="INTEGER"/>
>    <column name="name" size="100" type="VARCHAR"/>
>    <!--id-method-parameter name="seqName" value="DOG_NATIVE_DOG_ID_SEQ"/-->
>  </table>

>  <!-- idbroker with autoIncrement=true on PK -->
>  <!-- Resultant sql includes:
>            cat_id
>       Should be:
>            cat_id integer NOT NULL

If you want auto_increment, then this should probably be some sort of
serial type. This will conflict with the ID_BROKER.

[...]

>We could update Torque to consistently use the SERIAL datatype, and to 
>only so this when autoIncrement is true.  In addition to this it would 

That is a good idea. However, we must check first, whether SERIAL is
allowed for all number column types (don't know, never needed SERIAL
before).

>be great if we could provide a means of specifying the sequence name for 
>the particular column it is use on, perhaps seqName should be a column 
>attribute (and perhaps an option to determine whether or not the 
>sequence should be created and dropped would help out people with 
>existing sequences).

>Basically as it stands it is a right mess and needs to be addressed 
>before the next RC.

The problem is, that the different ID generators (auto_increment,
sequence, id_table) have different semantics on when to fetch the ID
values. So for postgresql, these must generate different column types.

idbroker, sequence  ----> regular number type NOT NULL (e.g. BIGINT NOT NULL)
auto_increment      ----> regular number type SERIAL

(SERIAL is translated internally to 

CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
    colname integer DEFAULT nextval('tablename_colname_seq') NOT NULL
);

according to http://www.postgresql.org/docs/7.3/static/datatype.html#DATATYPE-SERIAL)

when using "sequence", we should also create the sequence in the SQL. 

What is left is to find out what "native" should really be. IMHO it
should be equivalent "sequence" because "auto_increment" creates
implicit object in the database outside user control. This is the patch
that I've put on the list.

There is also a note with "SERIAL":

--- cut --- http://www.postgresql.org/docs/7.3/static/datatype.html#DATATYPE-SERIAL --- cut ---
Note: Prior to PostgreSQL 7.3, serial implied UNIQUE. This is no
longer automatic. If you wish a serial column to be UNIQUE or a
PRIMARY KEY it must now be specified, just as with any other data
type.
--- cut --- http://www.postgresql.org/docs/7.3/static/datatype.html#DATATYPE-SERIAL --- cut ---

so we might need to build a switch for pre-7.3 PostgreSQL databases.

	Regards
		Henning

-- 
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen          INTERMETA GmbH
hps@intermeta.de        +49 9131 50 654 0   http://www.intermeta.de/

RedHat Certified Engineer -- Jakarta Turbine Development  -- hero for hire
   Linux, Java, perl, Solaris -- Consulting, Training, Development

"Fighting for one's political stand is an honorable action, but re-
 fusing to acknowledge that there might be weaknesses in one's
 position - in order to identify them so that they can be remedied -
 is a large enough problem with the Open Source movement that it
 deserves to be on this list of the top five problems."
                       -- Michelle Levesque, "Fundamental Issues with
                                    Open Source Software Development"

---------------------------------------------------------------------
To unsubscribe, e-mail: torque-dev-unsubscribe@db.apache.org
For additional commands, e-mail: torque-dev-help@db.apache.org