Posted to derby-user@db.apache.org by Aaron Zeckoski <aa...@vt.edu> on 2008/11/04 19:17:42 UTC

Embedded database which only stores data in memory?

I am trying to use embedded Derby for testing but I am finding it much
slower than HSQLDB to start and run, and significantly more annoying
since I have to remove the actual files between test runs to ensure my
database is clean. Is there a way to force Derby not to create any
files, so that it operates more like HSQLDB?

-AZ


-- 
Aaron Zeckoski (aaronz@vt.edu)
Senior Research Engineer - CARET - Cambridge University
[http://bugs.sakaiproject.org/confluence/display/~aaronz/]
Sakai Fellow - [http://aaronz-sakai.blogspot.com/]

Re: Embedded database which only stores data in memory?

Posted by Rick Hillegas <Ri...@Sun.COM>.
What Bryan says. One useful tip is to bounce your schemas rather than 
the whole database in between test cases. That is, instead of recreating 
the database for each test case, just drop all of the tables, views, 
routines, and permissions. JUnit's setUp()/tearDown() idiom makes it 
easy to implement this technique. Try cloning the Derby test decorator 
called CleanDatabaseTestSetup.
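
A minimal sketch of the drop-everything idea, restricted to tables and using
only standard JDBC metadata calls. The class SchemaCleaner and its methods
are hypothetical names: they are not part of Derby, and this is not the code
of CleanDatabaseTestSetup. Views, routines and permissions are left out, and
so is any drop ordering that foreign keys might require.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Derby): drops every table in one schema. */
final class SchemaCleaner {

    private SchemaCleaner() {}

    /** Wraps an identifier in double quotes, doubling any embedded quote. */
    static String quote(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    /** Builds the DROP TABLE statement for one schema-qualified table. */
    static String dropTableSql(String schema, String table) {
        return "DROP TABLE " + quote(schema) + "." + quote(table);
    }

    /** Drops all objects of type TABLE in the schema; returns how many were dropped. */
    static int dropAllTables(Connection conn, String schema) throws SQLException {
        List<String> tables = new ArrayList<>();
        DatabaseMetaData meta = conn.getMetaData();
        // Collect the names first so the metadata cursor is closed before any DDL runs.
        try (ResultSet rs = meta.getTables(null, schema, "%", new String[] {"TABLE"})) {
            while (rs.next()) {
                tables.add(rs.getString("TABLE_NAME"));
            }
        }
        try (Statement stmt = conn.createStatement()) {
            for (String table : tables) {
                stmt.executeUpdate(dropTableSql(schema, table));
            }
        }
        return tables.size();
    }
}
```

From a tearDown() method this could be called as
SchemaCleaner.dropAllTables(conn, "APP"), assuming the tests create their
tables in Derby's default APP schema.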

Hope this helps,
-Rick

Bryan Pendleton wrote:
>> Just to chime in here.  We also use Derby for deployment, and are 
>> having the same grief with setup time for unit tests 
>
> For what it's worth, Derby itself uses Derby in its own unit tests (of
> course), and overall we have quite good performance, I believe, in the
> Derby unit tests themselves.
>
> The Derby source tree contains extensive tests and test utilities, and
> is a great source of ideas about how to set up unit tests to work with
> a database efficiently.
>
> You might try exploring that body of testing code, and you might try
> contacting the folks on the derby-dev list to discuss the particular
> issues you're seeing.
>
> There's also an extensive section in the Derby wiki which discusses
> the technique behind the Derby unit testing harness.
>
> I believe that, with a certain amount of care, you should be able to
> achieve quite good performance for your unit testing using Derby.
>
> thanks,
>
> bryan


RE: Embedded database which only stores data in memory?

Posted by Jim Newsham <jn...@referentia.com>.

> -----Original Message-----
> From: azeckoski@gmail.com [mailto:azeckoski@gmail.com] On Behalf Of Aaron
> Zeckoski
> Sent: Wednesday, November 05, 2008 10:52 PM
> To: Derby Discussion
> Subject: Re: Embedded database which only stores data in memory?
> 
> Do you have the code for this that you can share?
> 
> On Wed, Nov 5, 2008 at 7:31 PM, Jim Newsham <jn...@referentia.com>
> wrote:
> >
> >
> > No need for manual deletion or an external script... just clear out the
> > database in your test tear-down code.  Here is our strategy (junit 4.x):
> >
> > - Before all test cases (@BeforeClass):  Generate a temporary directory
> > randomly; create a database there, to be used by tests.
> > - After each test case (@After):  Drop all tables from the database
> > - After all test cases (@AfterClass):  Delete the temporary directory
> > recursively.
> >
> > Jim

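  import static org.junit.Assert.assertTrue;

  import java.io.File;
  import java.sql.SQLException;
  import java.util.Random;
  import javax.sql.DataSource;
  import org.apache.derby.jdbc.EmbeddedDataSource;
  import org.junit.After;
  import org.junit.AfterClass;
  import org.junit.Before;
  import org.junit.BeforeClass;

  // the members below belong inside a JUnit 4 test class
  private static File derbyDir;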
  private static DataSource dataSource;
  private static DataSource shutdownDataSource;
  
  @BeforeClass
  public static void createDatabase() throws Exception {
    // create a temp dir for holding the database
    derbyDir = makeTempDir();
    File dbDir = new File(derbyDir, "db");
    
    // create database and initialize it
    dataSource = getCreateDataSource(dbDir);
    initDatabaseStructureOrWhatever();
    
    // create a data source for shutting down later
    shutdownDataSource = getShutdownDataSource(dbDir);
  }
  
  /**
   * Shuts down and deletes the temporary database after all tests have 
   * completed.
   */
  @AfterClass
  public static void deleteDatabase() throws Exception {
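    try {
      shutdownDataSource.getConnection();
    }
    catch(SQLException sqle) {
      // a successful single-database shutdown throws an exception with
      // SQLState 08006; anything else is a genuine failure
      if (!"08006".equals(sqle.getSQLState())) {
        throw sqle;
      }
    }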
    System.gc();
    assertTrue("failed to delete temp dir", deleteRecursively(derbyDir));
    derbyDir = null;
  }
  
  @Before
  public void initDatabase() throws Exception {
    //...do standard per-test database setup
    //...this might involve creating tables, indexes, or whatever
  }
  
  @After
  public void cleanupDatabase() throws Exception {
    //...do standard per-test database cleanup
  }
  
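  // note: JUnit 4 does not guarantee the order in which several @After
  // methods of the same class run
  @After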
  public void resetDatabase() throws Exception {
    //... reset the database to a clean state for the next test
    //... this may involve dropping tables, indexes, or whatever
  }


  /**
   * Gets a data source for the derby database at the given path, and which 
   * creates the database if it doesn't already exist.  The data source will
   * connect to a local derby database using embedded mode.
   * @param path the path to the directory containing the database
   * @return a data source
   */
  private static DataSource getCreateDataSource(File path) {
    if (path == null) { 
      throw new IllegalArgumentException("path is null");
    }
    EmbeddedDataSource dataSource = new EmbeddedDataSource();
    dataSource.setDatabaseName(path.getPath());
    dataSource.setCreateDatabase("create");
    return dataSource;
  }
  
  /**
   * Gets a data source for the derby database at the given path, and which 
   * shuts down the database when it is connected to.  The data source will
   * connect to a local derby database using embedded mode.
   * @param path the path to the directory containing the database
   * @return a data source
   */
  private static DataSource getShutdownDataSource(File path) {
    if (path == null) { 
      throw new IllegalArgumentException("path is null");
    }
    EmbeddedDataSource dataSource = new EmbeddedDataSource();
    dataSource.setDatabaseName(path.getPath());
    dataSource.setShutdownDatabase("shutdown");
    return dataSource;
  }


  private static final String RANDOM_FILENAME_CHARS =
      "abcdefghijklmnopqrstuvwxyz";
  private static final int DEFAULT_RANDOM_FILENAME_LENGTH = 12;
  
  /**
   * Gets the system temporary directory.
   * @return the system temporary directory
   */
  private static File getSystemTempDir() {
    return new File(System.getProperty("java.io.tmpdir"));
  }
  
  /**
   * Creates and returns a temporary directory under the system temporary 
   * directory.  The directory is registered for deletion on JVM exit, which
   * only succeeds if it is empty by then.
   * @return a newly created temporary directory
   */
  private static File makeTempDir() {
    File sysDir = getSystemTempDir();
    File tempDir = null;
    while (tempDir == null || tempDir.exists()) {
      tempDir = new File(sysDir, generateRandomFilename());
    }
    // fail loudly instead of returning a directory that was never created
    if (!tempDir.mkdir()) {
      throw new IllegalStateException("could not create " + tempDir);
    }
    tempDir.deleteOnExit();
    return tempDir;
  }
  
  /**
   * Deletes the given file or directory, and all of its contained files and
   * directories.  If deletion does not complete successfully, some files
   * may have been deleted.
   * @param file the file or directory to delete 
   * @return whether the file and its children were deleted successfully
   */
  private static boolean deleteRecursively(File file) {
    File[] files = file.listFiles();
    if (files != null) {
      for (File f : files) {
        if (!deleteRecursively(f)) {
          return false;
        }
      }
    }
    return file.delete();
  }
  
  /**
   * Generates a random string suitable for use as a filename.  Because the
   * string is randomly generated, it is unlikely though possible that a 
   * matching file exists.
   * @return a random string suitable for use as a filename
   */
  private static String generateRandomFilename() {
    return generateRandomFilename(DEFAULT_RANDOM_FILENAME_LENGTH);
  }
  
  /**
   * Generates a random string suitable for use as a filename.  Because the
   * string is randomly generated, it is unlikely though possible that a 
   * matching file exists.
   * @param length the filename length, in characters; must be positive
   * @return a random string suitable for use as a filename
   */
  private static String generateRandomFilename(int length) {
    if (length < 1) {
      throw new IllegalArgumentException("length is not positive");
    }
    Random rand = new Random();
    char[] filename = new char[length];
    for (int i = 0; i < filename.length; i++) {
      int index = rand.nextInt(RANDOM_FILENAME_CHARS.length());
      filename[i] = RANDOM_FILENAME_CHARS.charAt(index);
    }
    return new String(filename);
  }
  



Re: Embedded database which only stores data in memory?

Posted by Aaron Zeckoski <aa...@vt.edu>.
Do you have the code for this that you can share?

On Wed, Nov 5, 2008 at 7:31 PM, Jim Newsham <jn...@referentia.com> wrote:
>
>
> No need for manual deletion or an external script... just clear out the
> database in your test tear-down code.  Here is our strategy (junit 4.x):
>
> - Before all test cases (@BeforeClass):  Generate a temporary directory
> randomly; create a database there, to be used by tests.
> - After each test case (@After):  Drop all tables from the database
> - After all test cases (@AfterClass):  Delete the temporary directory
> recursively.
>
> Jim




RE: Embedded database which only stores data in memory?

Posted by Jim Newsham <jn...@referentia.com>.

No need for manual deletion or an external script... just clear out the
database in your test tear-down code.  Here is our strategy (junit 4.x):

- Before all test cases (@BeforeClass):  Generate a temporary directory
randomly; create a database there, to be used by tests.
- After each test case (@After):  Drop all tables from the database
- After all test cases (@AfterClass):  Delete the temporary directory
recursively.
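
The first and last steps of this strategy (create a random temporary
directory once, delete it recursively at the end) can be sketched with
java.nio.file, available from Java 7 on. This is an illustration only:
TempDbDir and its method names are hypothetical, and it leaves out the
Derby-specific part (the database has to be shut down before its files
can be deleted reliably).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper for the create-once / delete-once steps of the strategy. */
final class TempDbDir {

    private TempDbDir() {}

    /** Creates a uniquely named directory under the system temporary directory. */
    static Path create(String prefix) {
        try {
            return Files.createTempDirectory(prefix);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deletes the directory tree rooted at root, children before parents. */
    static void deleteRecursively(Path root) {
        try {
            Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                        throws IOException {
                    Files.delete(file);
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                        throws IOException {
                    if (exc != null) {
                        throw exc; // listing the directory failed; do not hide it
                    }
                    Files.delete(dir);
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = create("derby-test-");
        // Stand-in for the files a database would leave behind.
        Files.createFile(Files.createDirectory(dir.resolve("db")).resolve("dummy.dat"));
        deleteRecursively(dir);
        System.out.println("still exists: " + Files.exists(dir));
    }
}
```

Unlike a boolean return value, Files.delete reports the reason for a failed
deletion in its exception, which helps when a file is still held open.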

Jim





Re: Embedded database which only stores data in memory?

Posted by Aaron Zeckoski <aa...@vt.edu>.
Thanks for the responses.

To answer the earliest question above, we are just using Derby for
testing. There are a couple of major missing features that make it
unsuitable for our production database needs. We were using HSQLDB
for testing before but have found the lack of true transactions to be
a real issue (i.e. TX problems are not caught in the tests). HSQLDB is
extremely easy to set up and very fast to start, though, so those
aspects made it almost ideal.

The suggestion about looking at the test cases seems reasonable so I
will do that shortly.
I am guessing you mean the stuff here right?
http://svn.apache.org/repos/asf/db/derby/code/trunk/java/testing/org/apache/derbyTesting/
I had a little trouble figuring out the source structure so let me
know if this is the wrong stuff to be looking at.

There is still a big issue no one in this thread has addressed yet, and
that is the fact that Derby creates files in the file system for each
test, which we have to clear manually (or end up with tests that fail
because the Derby database is already there and has unexpected data in
it). We are trying to drop Derby in without changing our tests much,
and so far I have been able to limit the changes to about 200 lines of
code (which already seems like a lot). Unfortunately, this is playing
havoc with our CI server because we now have to run an extra script to
clear out the Derby files before and after each build (just to be
sure).
I imagine there must be some option to disable file creation. If
anyone knows how to do this, please let me know.
The startup time is annoying, but we can live with slower tests if they
are more reliable.

Apologies if these are naive questions or I come off a bit strong. I
am under some time pressure here and vastly underestimated how long it
would take to switch from HSQLDB to Derby for testing.
Suggestions appreciated.
-AZ






Re: Embedded database which only stores data in memory?

Posted by Bryan Pendleton <bp...@amberpoint.com>.
> Just to chime in here.  We also use Derby for deployment, and are having 
> the same grief with setup time for unit tests 

For what it's worth, Derby itself uses Derby in its own unit tests (of
course), and overall we have quite good performance, I believe, in the
Derby unit tests themselves.

The Derby source tree contains extensive tests and test utilities, and
is a great source of ideas about how to set up unit tests to work with
a database efficiently.

You might try exploring that body of testing code, and you might try
contacting the folks on the derby-dev list to discuss the particular
issues you're seeing.

There's also an extensive section in the Derby wiki which discusses
the technique behind the Derby unit testing harness.

I believe that, with a certain amount of care, you should be able to
achieve quite good performance for your unit testing using Derby.

thanks,

bryan

Re: Embedded database which only stores data in memory?

Posted by Daniel Noll <da...@nuix.com>.
Dyre.Tjeldvoll@Sun.COM wrote:
> But would you also use Derby in deployment? Presumably usage of the
> database would be rather different in deployment? I mean, not many
> applications put temporary data in a relational database, so I'm
> guessing that in deployment you would not want to throw away the
> database files each time you close your application, right? 

Just to chime in here.  We also use Derby for deployment, and are having 
the same grief with setup time for unit tests (I know there is the 
option of using some other JDBC engine for testing, but we like to test 
against the same database we're deploying...)

It has gotten to the point where some test case classes have been 
reduced to a single test method, which is somewhat counter to the spirit 
of unit tests.

For tricks in the meantime:

  - Using a RAM disk presumably works but we haven't tried.

  - Turning off reliability features? (not important for tests)

  - Anything else?
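
On the "reliability features" point: Derby has a derby.system.durability
property, and setting it to test makes the engine skip the disk syncs it
normally performs, at the cost of a database that may not be recoverable
after a crash. A minimal sketch, assuming the property is set before the
embedded engine is first loaded in the JVM; the class and method names are
hypothetical.

```java
/** Hypothetical test bootstrap: relaxes Derby durability for throwaway test databases. */
final class DerbyTestSettings {

    /** Name of the Derby system property that controls durability. */
    static final String DURABILITY_PROPERTY = "derby.system.durability";

    private DerbyTestSettings() {}

    /** Must run before the embedded Derby engine is first loaded in this JVM. */
    static void relaxDurability() {
        System.setProperty(DURABILITY_PROPERTY, "test");
    }

    public static void main(String[] args) {
        relaxDurability();
        System.out.println(DURABILITY_PROPERTY + "=" + System.getProperty(DURABILITY_PROPERTY));
    }
}
```

The same setting can be passed on the command line as
-Dderby.system.durability=test, which keeps it out of the test code. It is
only appropriate for databases that are discarded after the run.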

Daniel


-- 
Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
http://nuix.com/                                and eDiscovery software

Re: Embedded database which only stores data in memory?

Posted by Dy...@Sun.COM.
Aaron Zeckoski <aa...@vt.edu> writes:

> I am trying to use embedded Derby for testing but I am finding it much
> slower than HSQLDB to start and run, and significantly more annoying
> since I have to remove the actual files between test runs to ensure my
> database is clean. Is there a way to force Derby not to create any
> files, so that it operates more like HSQLDB?

I believe there is an existing Jira issue for this, and that someone
started working on it, but the work was never completed.
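
For readers of the archive: Derby releases newer than this thread (10.5
onward, as far as I know) accept a memory subprotocol in the JDBC URL,
which keeps the whole database on the heap and creates no files. A minimal
sketch under that assumption; InMemoryDerby and its methods are
hypothetical names, and the drop=true attribute is, to my knowledge, only
available from 10.6.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper that builds connection URLs for an in-memory Derby database. */
final class InMemoryDerby {

    private InMemoryDerby() {}

    /** URL that creates the named in-memory database if it does not exist yet. */
    static String createUrl(String dbName) {
        return "jdbc:derby:memory:" + dbName + ";create=true";
    }

    /** URL that discards the named in-memory database (assumed to need Derby 10.6+). */
    static String dropUrl(String dbName) {
        return "jdbc:derby:memory:" + dbName + ";drop=true";
    }

    public static void main(String[] args) {
        // Needs derby.jar on the classpath; without it the driver lookup fails.
        try (Connection conn = DriverManager.getConnection(createUrl("testdb"))) {
            System.out.println("connected, closed=" + conn.isClosed());
        } catch (SQLException e) {
            System.out.println("could not connect: " + e.getMessage());
        }
    }
}
```

Using a different database name per test class gives each class a clean
database without touching the file system.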

It would be interesting to know how you were planning to use Derby if it
could run in memory only. Based on what you write, I'm assuming that you
are running a (unit) test framework (maybe even JUnit) that has a large
number of test cases, and that it is somehow inconvenient for you to let
all test cases use the same database and clean up afterwards. 

But would you also use Derby in deployment? Presumably usage of the
database would be rather different in deployment? I mean, not many
applications put temporary data in a relational database, so I'm
guessing that in deployment you would not want to throw away the
database files each time you close your application, right? 

And while I'm sure the problem you have with testing is a pain, I'm not
so sure Derby developers will queue up to solve it, as they probably are
more interested in making Derby a good database for deployments. 

-- 
dt