You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@oodt.apache.org by "Starch, Michael D (388L)" <Mi...@jpl.nasa.gov> on 2012/04/04 22:00:23 UTC

Organization of Java-SQL-DB Interface

All,

I am currently working on designing a catalog implementation that interacts with an oracle database for use with the filemanager component.  The current version uses sql that is inline with the java code.  However, a recent need for maintenance to this SQL has highlighted the need to pull the sql into separate files for easier maintenance.

What is the preferred way in oodt to store such information and load it?

-Michael Starch




Re: Organization of Java-SQL-DB Interface

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Mike,

On Apr 5, 2012, at 1:54 PM, Starch, Michael D (388L) wrote:
[...snip..]
> 
> The issue I am trying to solve is that in our operational environment we
> are starting to run into trouble because of some subtleties of SQL when
> hit with very large databases.  Thus we need our DBA to have the ability
> to iterate on the queries in order to see that they are optimal.  As our
> delivery process is long it would be nice to have the sql editable without
> incurring a month or more to get the code into operations again (where we
> might potentially see a new unforeseen error in the way our sql is laid
> out).

Understood.

> 
> Here is my next proposal, keep the sql in "java" but have the actual data
> access (sql) be in a separate class from the catalog implementation.  This
> sql class can be chosen by a property.  That way our project could
> implement a new sql class in our "adaptation" layer which is easier for us
> to deliver than core code but we would still be using the column-based
> organization.
> 
> It may also make it easier to adapt to other (non-oracle) databases as the
> sql is separated from the code that depends simply on java.sql.* (and a
> JDBC driver) from the code that is oracle specific.
> 
> Chris is this similar to what you were suggesting?

Yep it's precisely what I was suggesting.

1. Create a OracleColumnOrientedQueries.java file that is a java interface that
implements the SQL queries and associated met keys (fields) in the DB.

2. dynamically select which interface .java file selected via config.

An alternative to the Java interface approach is to look at the SqlScript.java
class to implement #1 above.

> 
> Thanks for your input!

NP, let me know if that helps!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: Organization of Java-SQL-DB Interface

Posted by "Starch, Michael D (388L)" <Mi...@jpl.nasa.gov>.
All,

I will agree from a unit testing perspective it makes it more difficult,
and it makes it harder to keep the SQL queries safe from manipulation from
the outside world after delivery (will be a concern in the future).

The issue I am trying to solve is that in our operational environment we
are starting to run into trouble because of some subtleties of SQL when
hit with very large databases.  Thus we need our DBA to have the ability
to iterate on the queries in order to see that they are optimal.  As our
delivery process is long it would be nice to have the sql editable without
incurring a month or more to get the code into operations again (where we
might potentially see a new unforeseen error in the way our sql is laid
out).

Here is my next proposal, keep the sql in "java" but have the actual data
access (sql) be in a separate class from the catalog implementation.  This
sql class can be chosen by a property.  That way our project could
implement a new sql class in our "adaptation" layer which is easier for us
to deliver than core code but we would still be using the column-based
organization.

It may also make it easier to adapt to other (non-oracle) databases as the
sql is separated from the code that depends simply on java.sql.* (and a
JDBC driver) from the code that is oracle specific.

Chris is this similar to what you were suggesting?

Thanks for your input!

-Michael



On 4/5/12 1:23 PM, "Mattmann, Chris A (388J)"
<ch...@jpl.nasa.gov> wrote:

>Hi Mike,
>
>On Apr 4, 2012, at 1:00 PM, Starch, Michael D (388L) wrote:
>
>> All,
>> 
>> I am currently working on designing a catalog implementation that
>>interacts with an oracle database for use with the filemanager
>>component.  The current version uses sql that is inline with the java
>>code.  However, a recent need for maintenance to this SQL has
>>highlighted the need to pull the sql into separate files for easier
>>maintenance.
>> 
>> What is the preferred way in oodt to store such information and load it?
>
>There is a class called SqlScript, here:
>
>http://s.apache.org/4As
>
>You could feasibly leverage that, but I agree with Brian, in that
>approach it certainly does make it
>harder to unit test. Would it be easier to simply factor the SQL out into
>an abstract interface of MetKeys,
>static final Strings, and then to reference that in the
>ColumnOrientedCatalog?
>
>Cheers,
>Chris
>
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Chris Mattmann, Ph.D.
>Senior Computer Scientist
>NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>Office: 171-266B, Mailstop: 171-246
>Email: chris.a.mattmann@nasa.gov
>WWW:   http://sunset.usc.edu/~mattmann/
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Adjunct Assistant Professor, Computer Science Department
>University of Southern California, Los Angeles, CA 90089 USA
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>


Re: Organization of Java-SQL-DB Interface

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Mike,

On Apr 4, 2012, at 1:00 PM, Starch, Michael D (388L) wrote:

> All,
> 
> I am currently working on designing a catalog implementation that interacts with an oracle database for use with the filemanager component.  The current version uses sql that is inline with the java code.  However, a recent need for maintenance to this SQL has highlighted the need to pull the sql into separate files for easier maintenance.
> 
> What is the preferred way in oodt to store such information and load it?

There is a class called SqlScript, here:

http://s.apache.org/4As

You could feasibly leverage that, but I agree with Brian, in that approach it certainly does make it
harder to unit test. Would it be easier to simply factor the SQL out into an abstract interface of MetKeys, 
static final Strings, and then to reference that in the ColumnOrientedCatalog?

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: Organization of Java-SQL-DB Interface

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Brian,

I shared your concerns, *and* your enthusiasm. 

Mike: I think you can use JIRA issue https://issues.apache.org/jira/browse/OODT-393 and 
we'll be happy to help you shepherd this.

Cheers,
Chris

On Apr 4, 2012, at 1:37 PM, <ho...@me.com> <ho...@me.com> wrote:

> hey mike,
> 
> Are you looking do something like this?  Storing queries in a file and adding in values utilizing PathUtils replacement:
> SELECT DISTINCT product_id from MOA_IASI_L1C_Metadata where element_id = 'urn:peate:NominalDate' and element_value = '[NominalDate]';
> 
> While this is cool and convenient, i see a few problems with dynamically loading the queries:
> 1) The inline queries in Java are able to be well unit-tested
> 2) Making the queries too easy to change can be a production code nightmare... This is the SQL which determines if the Product's metadata gets ingested correctly or not... i wouldn't want to just be able to easily changes these without going through a rigorous testing process to be sure they work correctly
> 
> While i noted these problems, i'm not adverse to this being checked in as a new implementation... i'd be interested to see what you've got... could you create a JIRA issue and attach your code?
> 
> -brian
> 
> Not sure you want these SQL commands to be easy to change... you pull them out then they become hard to unit-test... also 
> 
> On Apr 04, 2012, at 01:00 PM, "Starch, Michael D (388L)" <Mi...@jpl.nasa.gov> wrote:
> 
>> All,
>> 
>> I am currently working on designing a catalog implementation that interacts with an oracle database for use with the filemanager component. The current version uses sql that is inline with the java code. However, a recent need for maintenance to this SQL has highlighted the need to pull the sql into separate files for easier maintenance.
>> 
>> What is the preferred way in oodt to store such information and load it?
>> 
>> -Michael Starch
>> 
>> 
>> 


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: Organization of Java-SQL-DB Interface

Posted by ho...@me.com.
hey mike,

Are you looking do something like this? �Storing queries in a file and adding in values utilizing PathUtils replacement:
SELECT DISTINCT product_id�from MOA_IASI_L1C_Metadata where element_id = 'urn:peate:NominalDate' and element_value = '[NominalDate]';

While this is cool and convenient, i see a few problems with dynamically loading the queries:
1) The inline queries in Java are able to be well unit-tested
2) Making the queries too easy to change can be a production code nightmare... This is the SQL which determines if the Product's metadata gets ingested correctly or not... i wouldn't want to just be able to easily changes these without going through a rigorous testing process to be sure they work correctly

While i noted these problems, i'm not adverse to this being checked in as a new implementation... i'd be interested to see what you've got... could you create a JIRA issue and attach your code?

-brian

Not sure you want these SQL commands to be easy to change... you pull them out then they become hard to unit-test... also�

On Apr 04, 2012, at 01:00 PM, "Starch, Michael D (388L)" <Mi...@jpl.nasa.gov> wrote:

All,

I am currently working on designing a catalog implementation that interacts with an oracle database for use with the filemanager component. The current version uses sql that is inline with the java code. However, a recent need for maintenance to this SQL has highlighted the need to pull the sql into separate files for easier maintenance.

What is the preferred way in oodt to store such information and load it?

-Michael Starch