You are viewing a plain text version of this content. The canonical link for it is here.
Posted to ddlutils-user@db.apache.org by Thomas Dudziak <to...@gmail.com> on 2006/02/04 18:39:50 UTC

Re: oracle database model dangerously broken

I've investigated this, and in fact DdlUtils is behaving can be
expected. The reason is this:

DdlUtils can only fully support schemas in the database that is has
generated. The reason is simply that databases offer a whole lot more
than what DdlUtils can cover.
One aspect of this is that for most if not all databases, DdlUtils
actively supports only a subset of the native types that the database
has to offer. In the case of Oracle, (LONG) RAW is not one of the
supported types (mainly because Oracle discourages from using them in
favor of BLOB).
So while DdlUtils is able to read a table with a LONG RAW column
(which the JDBC driver reports as LONGVARBINARY), DdlUtils will treat
LONGVARBINARY as BLOB because Oracle has no dedicated LONGVARBINARY
type.
That is, if you create a database via DdlUtils and specify
LONGVARBINARY, you'll get a BLOB in the database. Now when you read
this back, the read column will be of type BLOB. And DdlUtils now
ensures that the column won't be changed when altering something other
in the database.

This may sound a bit complicated, but in the end this serves to
support the following workflow:

* create db via DdlUtils
* change something in the db model via DdlUtils => DdlUtils ensures
that as few changes as possible will be made to the db

The crucial thing is that this may conflict with the workflow that you've tried:

* create db outside of DdlUtils
* read the model via DdlUtils
* change something in the db model via DdlUtils

DdlUtils can only fully support one of these workflows, and IMO the
first one is more useful, so that's what DdlUtils focuses on. We're
trying hard to use the native types that the database vendors suggest,
so that even the second workflow works most of the time, but only if
the tables use these suggested types. E.g. if you'd use BLOB instead
of LONG RAW, DdlUtils would not try to change the column.

Btw, for Oracle it is advisable to specify the platform manually. E.g.
I've changed your code to:

OracleDataSource ods = new OracleDataSource();

ods.setURL("jdbc:oracle:thin:@localhost:1521:orcl");
ods.setUser("my_user");
ods.setPassword("my_password");

// note the change to BLOB here
String sql = "CREATE TABLE MY_TABLE( " +
               "prop_name VARCHAR2(200) NOT NULL PRIMARY KEY, " +
               "prop_value VARCHAR2(200), prop_value_ext BLOB)";

Connection conn = ods.getConnection();
Statement stmt = conn.createStatement();

stmt.execute(sql);
stmt.close();
conn.close();

String schema = "MY_SCHEMA";

// I've tested against an Oracle 10 database
Platform platform = PlatformFactory.createNewPlatformInstance("Oracle10");

platform.setDataSource(ods);
// the table was created without delimiters, so we should use DdlUtils
in the same way
platform.getPlatformInfo().setUseDelimitedIdentifiers(false);

Database db = platform.readModelFromDatabase("test", null, schema, null);

dumpDb(db);

// note the new arguments for catalog, schema, table types
System.out.println(platform.getAlterTablesSql(null, schema, null, db,
true, true, true));


Tom

Re: oracle database model dangerously broken

Posted by Thomas Dudziak <to...@gmail.com>.
On 2/7/06, Tim Dudgeon <td...@informaticsmatters.com> wrote:

> By this do you mean abstracting the issue of  ID generation? e.g .
> if oracle -> use a sequcnce
> if hsql -> use integer IDENTITY column
> if mysql -> use integer autoincrement

No (DdlUtils already does this in exactly the way you described).
Rather I was speaking of the ability to define a sequence in the
schema a database independent manner (for whatever purpose) and
DdlUtils will then write it to the database (if supported). Same could
probably be said for triggers and perhaps even stored procedures.
Likewise, DdlUtils should - as far as possible - read sequence
definitions from a database when generating a schema from a live
database.

> Reading it from an exising db would be trickier - in oracle I don't
> think the sequence is not specifically associated to the table it is
> being used by, so DDLUtils would need some hints?

Correct. Right now, DdlUtils cannot read sequences or triggers back
from an Oracle database.

Tom

Re: oracle database model dangerously broken

Posted by Tim Dudgeon <td...@informaticsmatters.com>.
Thomas Dudziak wrote:

>And there certainly is room for improvement
>in DdlUtils. E.g. it would be useful to have the ability to specify
>sequences in an database-independent way.
>
>  
>
By this do you mean abstracting the issue of  ID generation? e.g . 
if oracle -> use a sequcnce
if hsql -> use integer IDENTITY column
if mysql -> use integer autoincrement
....

If so, then I would soundly endorse this.
Creating this in a db independent manner shoudl be reasonably straight 
fowrard.
Reading it from an exising db would be trickier - in oracle I don't 
think the sequence is not specifically associated to the table it is 
being used by, so DDLUtils would need some hints?


Tim

Re: oracle database model dangerously broken

Posted by Thomas Dudziak <to...@gmail.com>.
On 2/7/06, Craig L Russell <Cr...@sun.com> wrote:

> Thanks for bearing with me here. I realize I'm raising use-cases that
> are not in the DdlUtils sweet spot perhaps; however, these cases are
> very important for the object-relational space I'm targeting.

Of course you're welcome to help making DdlUtils better (and more versatile) :-)

> Not really. If I want to use the Oracle database schema in a number
> of different use-cases, I want a library that gives me access to the
> specific column types. It's not portable across Oracle and Informix;
> it's portable across execution environments of my use-cases.

Ah, now I see where you're heading. Of course not even two Oracle
environments are alike (you might say different levels of
compatibility). But you're right, that might be useful, though not
really what DdlUtils is about currently.
I guess a simple change as a native type hint specifiable at a column
might already go some way in that direction.

> Right, especially if the user can tell us how big their Strings are
> so we can map to the correct abstract database type. But I also want
> the user to be able to tell us about a specific Oracle type in case
> they know they want Oracle and don't care about database portability.

Yep, native type hint.

> I agree that for this use-case, you would need a transformation from
> the Oracle types to the other database types, and do this
> transformation during a "copy schema" operation that would take the
> Oracle representation and apply some transformation rules to produce
> the other database format. I would see this transformer program as a
> utility not as a DdlUtils core feature.

Sure, and if only because the JDBC types are probably too limited for this.

Tom

Re: oracle database model dangerously broken

Posted by Craig L Russell <Cr...@Sun.COM>.
Hi Thomas,

Thanks for bearing with me here. I realize I'm raising use-cases that  
are not in the DdlUtils sweet spot perhaps; however, these cases are  
very important for the object-relational space I'm targeting.

On Feb 6, 2006, at 2:59 PM, Thomas Dudziak wrote:

> On 2/6/06, Craig L Russell <Cr...@sun.com> wrote:
>
>> Well, this describes my app and I guess I see it a bit differently.
>> I'd like to use DdlUtils to capture any database schema recognizing
>> that the full round trip development of schema is not supported.
>> Really all I want here is accurate representation of the actual  
>> schema.
>>
>> It's still of great value to be able to point DdlUtils at a database
>> and get its database-specific representation into a portable format
>> and to be able to use this format to do such things as generation of
>> POJO Java classes from the database. If we can't get the actual
>> database representation, then it's of much less value for this
>> application.
>
> Mhmm, I'm sorry but I think I still don't really understand what
> you're saying (just got back from the movies so I'm not fully here
> yet).
> Ain't database-specific representation and portability contradicting ?

Not really. If I want to use the Oracle database schema in a number  
of different use-cases, I want a library that gives me access to the  
specific column types. It's not portable across Oracle and Informix;  
it's portable across execution environments of my use-cases.

> Of course, I could try to read the exact schema using Oracle's sql
> plus and encode that into an XML file. But what is the benefit ? I can
> only use this with an Oracle database (and likely only with certain
> versions) so it isn't portable.
> On the other hand, having a portable version (e.g. in terms of JDBC)
> looses detail that might or might not be important to the application.
> This is why DdlUtils currently focuses on database creation.

Ah, your sweet spot. And one of my use-cases is taking an object  
model defined with Java types and using DdlUtils to define the schema  
so it's portable across different databases.

> If the
> developer has no problems defining the schema in terms of JDBC types
> and constraints, then there are no database specific details to loose.

Right, especially if the user can tell us how big their Strings are  
so we can map to the correct abstract database type. But I also want  
the user to be able to tell us about a specific Oracle type in case  
they know they want Oracle and don't care about database portability.
>
> I can see the benefit of being able to read a schema from an existing
> database and to transfer that to another database. But for databases
> that contain database specifics, full detail preservation can IMO only
> be achieved if the databases are the same and a database-specific
> format is used (i.e. SQL something similar produced by a dump).

I agree that for this use-case, you would need a transformation from  
the Oracle types to the other database types, and do this  
transformation during a "copy schema" operation that would take the  
Oracle representation and apply some transformation rules to produce  
the other database format. I would see this transformer program as a  
utility not as a DdlUtils core feature.

> When transfering to a different database product, you're bound to
> loose some detail. Of course, the shared feature set is likely larger
> than what JDBC provides. And there certainly is room for improvement
> in DdlUtils. E.g. it would be useful to have the ability to specify
> sequences in an database-independent way.

I see great value in this kind of behavior as well.

Regards,

Craig
>
> regards,
> Tom

Craig Russell
Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
408 276-5638 mailto:Craig.Russell@sun.com
P.S. A good JDO? O, Gasp!


Re: oracle database model dangerously broken

Posted by Thomas Dudziak <to...@gmail.com>.
On 2/6/06, Craig L Russell <Cr...@sun.com> wrote:

> Well, this describes my app and I guess I see it a bit differently.
> I'd like to use DdlUtils to capture any database schema recognizing
> that the full round trip development of schema is not supported.
> Really all I want here is accurate representation of the actual schema.
>
> It's still of great value to be able to point DdlUtils at a database
> and get its database-specific representation into a portable format
> and to be able to use this format to do such things as generation of
> POJO Java classes from the database. If we can't get the actual
> database representation, then it's of much less value for this
> application.

Mhmm, I'm sorry but I think I still don't really understand what
you're saying (just got back from the movies so I'm not fully here
yet).
Ain't database-specific representation and portability contradicting ?
Of course, I could try to read the exact schema using Oracle's sql
plus and encode that into an XML file. But what is the benefit ? I can
only use this with an Oracle database (and likely only with certain
versions) so it isn't portable.
On the other hand, having a portable version (e.g. in terms of JDBC)
looses detail that might or might not be important to the application.
This is why DdlUtils currently focuses on database creation. If the
developer has no problems defining the schema in terms of JDBC types
and constraints, then there are no database specific details to loose.

I can see the benefit of being able to read a schema from an existing
database and to transfer that to another database. But for databases
that contain database specifics, full detail preservation can IMO only
be achieved if the databases are the same and a database-specific
format is used (i.e. SQL something similar produced by a dump).
When transfering to a different database product, you're bound to
loose some detail. Of course, the shared feature set is likely larger
than what JDBC provides. And there certainly is room for improvement
in DdlUtils. E.g. it would be useful to have the ability to specify
sequences in an database-independent way.

regards,
Tom

Re: oracle database model dangerously broken

Posted by Craig L Russell <Cr...@Sun.COM>.
Hi Thomas,

On Feb 6, 2006, at 2:25 PM, Thomas Dudziak wrote:

> On 2/6/06, Craig L Russell <Cr...@sun.com> wrote:
>
>> I think that it would be more useful if DdlUtils distinguished
>> between the type actually stored in the database versus the mapping
>> from the abstract type to the actual type.
>>
>> Specifically, I'd like to see it be able to know the difference
>> between a column defined as LONG RAW and BLOB, since Oracle treats
>> them as different. If the user wants to define a real column type
>> they should be able to use either LONG RAW or BLOB. If the user just
>> wants an abstract column type LONGVARBINARY, then I have no problem
>> with DdlUtils creating a BLOB by default (if the user doesn't
>> override the generated column type with a specific type).
>>
>> I haven't looked closely enough into the implications of this, but I
>> have worked with column types on many projects and it is generally
>> useful to separate the actual column type from the generated column
>> type based on an abstract type.
>>
>> Another example is the abstract type String with a length. Databases
>> have different names for various lengths, e.g. VARCHAR, VARCHAR2,
>> CLOB. So the type for a String-6000 will be different for different
>> databases. But the actual column type should always be available to
>> the user of the API.
>
> Craig, I'm not exactly sure what you mean. If you're implying that the
> user should have (optional) control over the actual native type that
> is being used for a column, then I agree. This is something definitely
> planned, both for the API and the Ant tasks, though it is not targeted
> for the 1.0.

Yes, this is what I mean.
>
> However, it is the intention of DdlUtils to give users the ability to
> define the database model completely in terms of JDBC types and
> constraints, and then it just works.

Cool.

> Of course, this won't suffice for
> all applications, but it does not have to. Those apps that rely on
> specific features of a database (e.g. specific datatypes, stored
> procedures, triggers, ...) are bound to this database in any case, and
> so they would not want to use DdlUtils in the first place.

Well, this describes my app and I guess I see it a bit differently.  
I'd like to use DdlUtils to capture any database schema recognizing  
that the full round trip development of schema is not supported.  
Really all I want here is accurate representation of the actual schema.

It's still of great value to be able to point DdlUtils at a database  
and get its database-specific representation into a portable format  
and to be able to use this format to do such things as generation of  
POJO Java classes from the database. If we can't get the actual  
database representation, then it's of much less value for this  
application.

Regards,

Craig
>
> regards,
> Tom

Craig Russell
Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
408 276-5638 mailto:Craig.Russell@sun.com
P.S. A good JDO? O, Gasp!


Re: oracle database model dangerously broken

Posted by Thomas Dudziak <to...@gmail.com>.
On 2/6/06, Craig L Russell <Cr...@sun.com> wrote:

> I think that it would be more useful if DdlUtils distinguished
> between the type actually stored in the database versus the mapping
> from the abstract type to the actual type.
>
> Specifically, I'd like to see it be able to know the difference
> between a column defined as LONG RAW and BLOB, since Oracle treats
> them as different. If the user wants to define a real column type
> they should be able to use either LONG RAW or BLOB. If the user just
> wants an abstract column type LONGVARBINARY, then I have no problem
> with DdlUtils creating a BLOB by default (if the user doesn't
> override the generated column type with a specific type).
>
> I haven't looked closely enough into the implications of this, but I
> have worked with column types on many projects and it is generally
> useful to separate the actual column type from the generated column
> type based on an abstract type.
>
> Another example is the abstract type String with a length. Databases
> have different names for various lengths, e.g. VARCHAR, VARCHAR2,
> CLOB. So the type for a String-6000 will be different for different
> databases. But the actual column type should always be available to
> the user of the API.

Craig, I'm not exactly sure what you mean. If you're implying that the
user should have (optional) control over the actual native type that
is being used for a column, then I agree. This is something definitely
planned, both for the API and the Ant tasks, though it is not targeted
for the 1.0.

However, it is the intention of DdlUtils to give users the ability to
define the database model completely in terms of JDBC types and
constraints, and then it just works. Of course, this won't suffice for
all applications, but it does not have to. Those apps that rely on
specific features of a database (e.g. specific datatypes, stored
procedures, triggers, ...) are bound to this database in any case, and
so they would not want to use DdlUtils in the first place.

regards,
Tom

Re: oracle database model dangerously broken

Posted by Craig L Russell <Cr...@Sun.COM>.
Hi Thomas,

On Feb 4, 2006, at 9:39 AM, Thomas Dudziak wrote:

> I've investigated this, and in fact DdlUtils is behaving can be
> expected. The reason is this:
>
> DdlUtils can only fully support schemas in the database that is has
> generated. The reason is simply that databases offer a whole lot more
> than what DdlUtils can cover.
> One aspect of this is that for most if not all databases, DdlUtils
> actively supports only a subset of the native types that the database
> has to offer. In the case of Oracle, (LONG) RAW is not one of the
> supported types (mainly because Oracle discourages from using them in
> favor of BLOB).

I think that it would be more useful if DdlUtils distinguished  
between the type actually stored in the database versus the mapping  
from the abstract type to the actual type.

Specifically, I'd like to see it be able to know the difference  
between a column defined as LONG RAW and BLOB, since Oracle treats  
them as different. If the user wants to define a real column type  
they should be able to use either LONG RAW or BLOB. If the user just  
wants an abstract column type LONGVARBINARY, then I have no problem  
with DdlUtils creating a BLOB by default (if the user doesn't  
override the generated column type with a specific type).

I haven't looked closely enough into the implications of this, but I  
have worked with column types on many projects and it is generally  
useful to separate the actual column type from the generated column  
type based on an abstract type.

Another example is the abstract type String with a length. Databases  
have different names for various lengths, e.g. VARCHAR, VARCHAR2,  
CLOB. So the type for a String-6000 will be different for different  
databases. But the actual column type should always be available to  
the user of the API.

Just a couple of pennies thought,

Craig

> So while DdlUtils is able to read a table with a LONG RAW column
> (which the JDBC driver reports as LONGVARBINARY), DdlUtils will treat
> LONGVARBINARY as BLOB because Oracle has no dedicated LONGVARBINARY
> type.
> That is, if you create a database via DdlUtils and specify
> LONGVARBINARY, you'll get a BLOB in the database. Now when you read
> this back, the read column will be of type BLOB. And DdlUtils now
> ensures that the column won't be changed when altering something other
> in the database.
>
> This may sound a bit complicated, but in the end this serves to
> support the following workflow:
>
> * create db via DdlUtils
> * change something in the db model via DdlUtils => DdlUtils ensures
> that as few changes as possible will be made to the db
>
> The crucial thing is that this may conflict with the workflow that  
> you've tried:
>
> * create db outside of DdlUtils
> * read the model via DdlUtils
> * change something in the db model via DdlUtils
>
> DdlUtils can only fully support one of these workflows, and IMO the
> first one is more useful, so that's what DdlUtils focuses on. We're
> trying hard to use the native types that the database vendors suggest,
> so that even the second workflow works most of the time, but only if
> the tables use these suggested types. E.g. if you'd use BLOB instead
> of LONG RAW, DdlUtils would not try to change the column.
>
> Btw, for Oracle it is advisable to specify the platform manually. E.g.
> I've changed your code to:
>
> OracleDataSource ods = new OracleDataSource();
>
> ods.setURL("jdbc:oracle:thin:@localhost:1521:orcl");
> ods.setUser("my_user");
> ods.setPassword("my_password");
>
> // note the change to BLOB here
> String sql = "CREATE TABLE MY_TABLE( " +
>                "prop_name VARCHAR2(200) NOT NULL PRIMARY KEY, " +
>                "prop_value VARCHAR2(200), prop_value_ext BLOB)";
>
> Connection conn = ods.getConnection();
> Statement stmt = conn.createStatement();
>
> stmt.execute(sql);
> stmt.close();
> conn.close();
>
> String schema = "MY_SCHEMA";
>
> // I've tested against an Oracle 10 database
> Platform platform = PlatformFactory.createNewPlatformInstance 
> ("Oracle10");
>
> platform.setDataSource(ods);
> // the table was created without delimiters, so we should use DdlUtils
> in the same way
> platform.getPlatformInfo().setUseDelimitedIdentifiers(false);
>
> Database db = platform.readModelFromDatabase("test", null, schema,  
> null);
>
> dumpDb(db);
>
> // note the new arguments for catalog, schema, table types
> System.out.println(platform.getAlterTablesSql(null, schema, null, db,
> true, true, true));
>
>
> Tom

Craig Russell
Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
408 276-5638 mailto:Craig.Russell@sun.com
P.S. A good JDO? O, Gasp!


Re: oracle database model dangerously broken

Posted by Thomas Dudziak <to...@gmail.com>.
On 2/6/06, Tim Dudgeon <td...@informaticsmatters.com> wrote:

> thanks for that info. I understand your explaination, but doesn't this
> significantly limit the use of DDLUtils?
> If DDLUtils can make uncalled for changes to any database that has been
> created, or modified, by anything other than DDLUtils, then that's quite
> a high risk to accept?

Its not so much limiting but ruther a matter of focus. In its current
version DdlUtils primarily intends to allow Java applications to
create and initialize databases without having to care about the
actual database and its native types, SQL structure, etc.

One particular point is that DdlUtils must ensure that the native type
that a JDBC type is mapped to, is suited (large enough, the driver
accepts the corresponding Java type etc.). The problem now is that
when reading back a model, DdlUtils has to rely on the JDBC driver's
implementation of the database metadata. And if the driver decides
that multiple types map to the same JDBC type (think Oracle which maps
numeric types back to JDBC type DECIMAL), then we're in trouble
because DdlUtils has no way to distinguish between them (perhaps only
via the size as DdlUtils does in the case of Oracle) and hence
problems as in your case are bound to arise.

That being said, it would surely be possible to invest more work and
make this more robust. But this means a lot of work, and thus cannot
be a target for the 1.0 release which should come out
real-soon-now(tm).

Tom

Re: oracle database model dangerously broken

Posted by Tim Dudgeon <td...@informaticsmatters.com>.
Hi Tom,

thanks for that info. I understand your explaination, but doesn't this 
significantly limit the use of DDLUtils?
If DDLUtils can make uncalled for changes to any database that has been 
created, or modified, by anything other than DDLUtils, then that's quite 
a high risk to accept?

Tim



Thomas Dudziak wrote:

>I've investigated this, and in fact DdlUtils is behaving can be
>expected. The reason is this:
>
>DdlUtils can only fully support schemas in the database that is has
>generated. The reason is simply that databases offer a whole lot more
>than what DdlUtils can cover.
>One aspect of this is that for most if not all databases, DdlUtils
>actively supports only a subset of the native types that the database
>has to offer. In the case of Oracle, (LONG) RAW is not one of the
>supported types (mainly because Oracle discourages from using them in
>favor of BLOB).
>So while DdlUtils is able to read a table with a LONG RAW column
>(which the JDBC driver reports as LONGVARBINARY), DdlUtils will treat
>LONGVARBINARY as BLOB because Oracle has no dedicated LONGVARBINARY
>type.
>That is, if you create a database via DdlUtils and specify
>LONGVARBINARY, you'll get a BLOB in the database. Now when you read
>this back, the read column will be of type BLOB. And DdlUtils now
>ensures that the column won't be changed when altering something other
>in the database.
>
>This may sound a bit complicated, but in the end this serves to
>support the following workflow:
>
>* create db via DdlUtils
>* change something in the db model via DdlUtils => DdlUtils ensures
>that as few changes as possible will be made to the db
>
>The crucial thing is that this may conflict with the workflow that you've tried:
>
>* create db outside of DdlUtils
>* read the model via DdlUtils
>* change something in the db model via DdlUtils
>
>DdlUtils can only fully support one of these workflows, and IMO the
>first one is more useful, so that's what DdlUtils focuses on. We're
>trying hard to use the native types that the database vendors suggest,
>so that even the second workflow works most of the time, but only if
>the tables use these suggested types. E.g. if you'd use BLOB instead
>of LONG RAW, DdlUtils would not try to change the column.
>
>Btw, for Oracle it is advisable to specify the platform manually. E.g.
>I've changed your code to:
>
>OracleDataSource ods = new OracleDataSource();
>
>ods.setURL("jdbc:oracle:thin:@localhost:1521:orcl");
>ods.setUser("my_user");
>ods.setPassword("my_password");
>
>// note the change to BLOB here
>String sql = "CREATE TABLE MY_TABLE( " +
>               "prop_name VARCHAR2(200) NOT NULL PRIMARY KEY, " +
>               "prop_value VARCHAR2(200), prop_value_ext BLOB)";
>
>Connection conn = ods.getConnection();
>Statement stmt = conn.createStatement();
>
>stmt.execute(sql);
>stmt.close();
>conn.close();
>
>String schema = "MY_SCHEMA";
>
>// I've tested against an Oracle 10 database
>Platform platform = PlatformFactory.createNewPlatformInstance("Oracle10");
>
>platform.setDataSource(ods);
>// the table was created without delimiters, so we should use DdlUtils
>in the same way
>platform.getPlatformInfo().setUseDelimitedIdentifiers(false);
>
>Database db = platform.readModelFromDatabase("test", null, schema, null);
>
>dumpDb(db);
>
>// note the new arguments for catalog, schema, table types
>System.out.println(platform.getAlterTablesSql(null, schema, null, db,
>true, true, true));
>
>
>Tom
>
>  
>