Posted to torque-dev@db.apache.org by "Henning P. Schmiedehausen" <hp...@intermeta.de> on 2003/10/24 12:13:15 UTC

[PATCHES][Discussion wanted] My proposals from the last two weeks

Hi,

Over the last two weeks, I added four patch proposals to the Torque
3.1 branch. While they're not exactly 3.1 material, I prefer to keep
them there, mainly because I didn't want to interfere with the ongoing
3.2-dev development (as you may have noticed, time constraints have
forced me to go a little "underground" and ATM I cannot participate on
this list as actively as I might want to).

Here is a little summary of what these proposed patches do, why I
needed them, why I'd really like to see them in the 3.2-dev branch,
and (of course) why anyone would benefit from them.

I have a (quite largish) application which uses Torque as its back
end. Up until now, I loaded the Torque tables with generated SQL
templates and the various native clients (psql, mysql, you name it).
For this application, the customer requested "100% pure Java", so I
need load/save tools written in Java.

The start was simply to hard-code the table structures in my
database. As I was shooting at a moving target (an "agile" process),
this proved to be very error-prone after a short while. So I decided
to write a generic Torque table load/save mechanism.

The first problem was finding all the tables currently available in a
database. Since the peer classes only register with the Torque core
when they are used, and a save program can only use tables it knows
about, we had a chicken-and-egg problem.

As the database schema is the logical place to collect information
about all the tables, I added the mapbuilder-init proposal, which
enables the generator to create an additional class per database
schema that can initialize all the table peers in that schema at a
defined time.

This is the least intrusive patch to Torque, and I would really like
to see it go in for both 3.1 and 3.2.
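
Just to illustrate the idea, the generated class could look roughly
like the following. This is only a sketch: the package, class and peer
names are invented, and only the pattern of touching every peer so
that its table map gets registered is what the generator would
automate.

package com.mycompany.om.map;

import org.apache.torque.TorqueException;

// Hypothetical sketch, not actual generator output; the class and
// peer names are placeholders.
public class MydbMapInit
{
    /** Touch every peer of the schema so its table map is registered. */
    public static void init() throws TorqueException
    {
        com.mycompany.om.AuthorPeer.getTableMap();
        com.mycompany.om.BookPeer.getTableMap();
        // ... one line per table defined in the schema ...
    }
}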

Ok, now that I had my tables in place, I wanted to loop over them and
write out their objects. To be able to do so, I needed the names of
the attributes defined in the schema. This information is not
available in the current Database/Table/Column map. The second
proposal adds a new property to the ColumnMap objects, called
"JavaName", which in turn gets set by the generated templates to the
actual Java name of the bean property for this column.
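
To make the intent concrete, the kind of loop I have in mind looks
roughly like this. Only getJavaName() is the accessor the patch adds
to ColumnMap; everything else is plain reflection, and RowDumper
itself is just an invented example class.

import java.lang.reflect.Method;

import org.apache.torque.map.ColumnMap;
import org.apache.torque.map.TableMap;

// Sketch only: getJavaName() is the proposed new ColumnMap accessor.
public final class RowDumper
{
    public static void dump(Object obj, TableMap tableMap) throws Exception
    {
        ColumnMap[] columns = tableMap.getColumns();
        for (int i = 0; i < columns.length; i++)
        {
            // The JavaName (e.g. "AuthorId") ties the column (AUTHOR_ID)
            // to its bean property, so the getter can be called reflectively.
            String javaName = columns[i].getJavaName();
            Method getter = obj.getClass().getMethod("get" + javaName, new Class[0]);
            System.out.println(javaName + " = " + getter.invoke(obj, new Object[0]));
        }
    }
}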

This patch is quite intrusive. It adds a number of additional methods
to TableMap and a new attribute to ColumnMap, and the generated peer
classes are no longer compatible with older Torque versions. (Older
peer classes _should_ work with Torque plus this patch; I didn't test
this, however. These classes won't benefit from the JavaName, as it is
not set by their generated code.) This patch is definitely 3.2
material; I'd like to see it go in early, so we won't get a beta or an
rc and then a sudden breakage if this patch goes in late in the
release cycle.

Alright, now I had my stuff in place and started to happily write out
my tables into XML (gzipped and base64ed, but that is another story
;-) ). And then my saver crashed, because it encountered a table that
is "not like the other tables": ID_TABLE.

Well, for starters, ID_TABLE didn't have a peer class or a Torque
object. I thought: "Well, easy, just run the supplied schema through
torque:om." That worked, but the resulting class didn't compile. :-(
Reason: ID_TABLE contains a column called "TABLE_NAME", so the peer
wants to contain a constant called IdTablePeer.TABLE_NAME for the name
of this column. But all peer classes already contain a constant called
"TABLE_NAME" for the name of the table itself. Great design! Not.
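
Schematically, the name clash in the generated peer looks like this
(not real generator output, just the shape of the problem):

// Every generated peer defines TABLE_NAME for the table itself ...
public class IdTablePeer
{
    /** Name of the table -- present in every generated peer. */
    public static final String TABLE_NAME = "ID_TABLE";

    // ... so the constant for the column ID_TABLE.TABLE_NAME would need
    // the very same identifier -- a duplicate field, which is why the
    // generated class does not compile:
    // public static final String TABLE_NAME = "ID_TABLE.TABLE_NAME";
}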

Enter idtable-peer.patch. As I had to do some work on this anyway, I
decided to do it right. This patch reworks the build process of Torque
(remember my question on maven-dev, which went unanswered? I expected
nothing else) to incorporate a generator-built peer class for the
ID_TABLE, and it also renames the columns of the ID_TABLE so they are
prefixed with ID_TABLE_.

This patch is quite intrusive, too. While not exactly in terms of
Torque-related Java code (which should be unchanged), you must rebuild
your SQL statements from the generator (because the column names of
ID_TABLE have changed), and an older Torque won't work together with a
newer Torque on the same database (because of this change).

I added a migration script for my current favourite database
(PostgreSQL); adding scripts for e.g. MySQL or Oracle should be just
as easy (it is straightforward SQL that might even work with other
databases unchanged).

This patch is definitely 3.2 material; while it's not as intrusive as
the column-name patch, it does break backward compatibility and you
will have to rebuild your database after moving to Torque 3.2-dev.

Caveat: If you apply this patch and want to rebuild both the
generator and the runtime, you _must_ build the generator first and
then build the runtime with this newly built generator! Make sure that
your Maven picks up the newly built generator jar! If you build the
runtime with an old generator, you will get om classes for the
ID_TABLE that don't actually work!

Ok, now that I had a working "save", I started to work on "load",
which was much simpler (leaving aside the fact that I had to seriously
force commons-digester to load my 1.5 MB XML attributes...), but then
I hit a brick wall:

PostgreSQL has a native sequencing scheme and the deploy database has
another, so we settled on using the idbroker for id generation.

Now, as we reloaded our saved data, of course all the tables and rows
already had their id information (and we have _lots_ of foreign keys
tying them together). When inserting this saved data back into the
database, all the id keys were lost and replaced with new ids from the
broker. None of the FKs were still valid (in fact, the database choked
on the first object that was inserted and referenced a foreign key).

So I needed another patch which allows an application to say "stop,
don't do any new id generation, we already have all the information"
to the Torque core and insert an object "as is" into the database.
(No, please don't suggest setNew(false); that would try to run
doUpdate on our object. However, we don't have anything in the
database yet, so there is nothing to update...)

This is the last patch: skipidbroker.patch. It adds a bunch of
methods to the Persistent interface and the BasePeer. For every save
method, there is another one with an added boolean. If this boolean is
true, the object is inserted "as is" without using the idbroker.

All old save methods are kept and retain their normal operation (use
the id broker).
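
A minimal sketch of the intended call pattern, assuming the patch is
applied; the boolean-taking save() is my paraphrase of the proposal,
so its exact signature may differ:

import org.apache.torque.om.Persistent;

public class Reload
{
    /** obj already carries its original primary key from the XML dump. */
    public static void restore(Persistent obj) throws Exception
    {
        // obj.save();     // would ask the idbroker for a fresh id
        obj.save(true);    // with the patch: insert "as is", keep the id
    }
}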

This patch is still intrusive, as it changes the signatures of the
Persistent interface (and also the generated peer classes; classes
generated with this patch applied might not work with older Torque).

As I don't consider myself a core Torque developer but mainly a
"power Torque user with commit rights", I'd like to hear some opinions
from the core developers (Martin?) before putting these changes in. As
I stated above, after applying them there will be a break in backward
compatibility that is not easy to solve (especially the ID_TABLE
stuff).

I use all four patches with great success in our application and
personally, I'd benefit greatly from them.

Discussion wanted.

	Regards
		Henning

-- 
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen          INTERMETA GmbH
hps@intermeta.de        +49 9131 50 654 0   http://www.intermeta.de/

Java, perl, Solaris, Linux, xSP Consulting, Web Services 
freelance consultant -- Jakarta Turbine Development  -- hero for hire

"Dominate!! Dominate!! Eat your young and aggregate! I have grotty silicon!" 
      -- AOL CD when played backwards  (User Friendly - 200-10-15)



Re: [PATCHES][Discussion wanted] My proposals from the last two weeks

Posted by Scott Eade <se...@backstagetech.com.au>.
Henning,

I leave it to those more informed than I to review your proposals, but I 
thought I would throw the following into the mix as it may be an 
alternative use case for something like what you are working on.

I am in the process of migrating a Turbine/Torque webapp from MySQL to 
PostgreSQL - I have my app working using PostgreSQL, but now need to 
convert the existing data across (handling things such as MySQL's 
AUTO_INCREMENT vs PostgreSQL's SEQUENCE, date/time/timestamp 
differences, etc.).  In the process of migrating the data I have 
actually run into a few of the issues you are addressing, or at least 
variations on them.

Under MySQL I use AUTO_INCREMENT for all of my tables that have IDs, 
including the Turbine* tables (thanks to your TorqueUser code in Turbine 
2.3 this works brilliantly).

I couldn't see any free data migration tools so I decided to use Torque 
to do it for me.  Here is a brief summary of what I have done:

1. I generate two sets of Torque classes - one for MySQL and another for 
PostgreSQL.  These have different project names and packages so I can 
access them simultaneously from the same purpose built conversion 
application.  The PostgreSQL schema has its idMethod set to "none" (in 
the real application it is "native") so that when the conversion runs 
the ids allocated by MySQL are retained and thus all of the foreign key 
references can be successfully transferred (i.e. referential integrity 
is maintained).

2. The sql that Torque generates to create the PostgreSQL database from 
the true application schema (the one with idMethod="native") has to be 
manipulated to remove the sql that creates and drops the sequences since 
PostgreSQL 7.3 does this automatically (of course for the "real" app you 
also have to add id-method-parameter elements to the schema to set the 
name of the sequences correctly since Torque gets it wrong).  To manage 
the RI, I have moved the sql that defines the RI constraints into a 
separate file that I will run after the data has been loaded.

3. I manually create a new method doMigrateData() in the PostgreSQL peer 
classes, and invoke this for all tables in my schema.  The new method 
looks like this (whole class shown for context):

package com.backstagetech.cmes.postgresql.om;

import java.util.Iterator;
import java.util.List;

import org.apache.torque.TorqueException;
import org.apache.torque.util.BasePeer;
import org.apache.torque.util.Criteria;

public class TurbineGroupPeer
    extends com.backstagetech.cmes.postgresql.om.BaseTurbineGroupPeer
{
    /**
     * Migrate the record from MySQL to PostgreSQL.
     *
     * To update for a new class:
     * 1. Replace TurbineGroup with the new class name.
     * 2. Replace the primary key (GROUP_ID) with the one for the new class.
     */
    public static void doMigrateData() throws TorqueException
    {
        List mysqlRecords
                = com.backstagetech.cmes.mysql.om.TurbineGroupPeer
                        .doSelectVillageRecords(new Criteria());

        List postgresqlObjects
                = TurbineGroupPeer.populateObjects(mysqlRecords);

        for (Iterator iter = postgresqlObjects.iterator(); iter.hasNext();)
        {
            TurbineGroup postgresqlObject = (TurbineGroup) iter.next();

            postgresqlObject.setNew(true);
            postgresqlObject.setModified(true);
            postgresqlObject.save();
        }

        // The following is only necessary when the table has an
        // autoIncrement primary key.
        String serialQuery = "SELECT setval('"
                + ((String) TurbineGroupPeer.getTableMap()
                        .getPrimaryKeyMethodInfo()).toLowerCase()
                + "', max("
                + TurbineGroupPeer.GROUP_ID
                + ")) FROM " + TurbineGroupPeer.TABLE_NAME;
        BasePeer.executeQuery(serialQuery);
    }
}
Using an Eclipse template this is really easy to create. As this is
only a one-off conversion, I have not really considered whether this
is an optimal solution nor how it might be automated.

4. After execution I should be able to run the sql that enables the RI 
rules, switch back to my webapp and start using PostgreSQL.  I say 
"should" because I am in the middle of this now - so far the data looks 
good.

Torque actually seems to handle this task quite well and for the limited 
one-off conversion I need the above seems to work fine.

Scott

-- 
Scott Eade
Backstage Technologies Pty. Ltd.
http://www.backstagetech.com.au




