Posted to derby-user@db.apache.org by Daniel Noll <da...@nuix.com> on 2008/09/19 04:09:56 UTC

Client/server plus network storage of database files

Hi all.

I have a question about the case where an embedded Derby database is 
being stored on NFS or CIFS or whatever.

Basically what I'm seeing happen is, a network outage occurs at some 
point in time.  This causes a fairly normal store error as you would 
expect, but then Derby "remembers" that the database is corrupt and 
refuses any further operations, even if the network comes back up.

We can attempt to get around this when operating in pure embedded mode 
by doing a full shutdown and re-open, and this does work.
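
For illustration only (this is not code from the thread), a full shutdown and re-open of a single embedded database might look roughly like the Java sketch below. The class name `DerbyReopen`, its helper methods and the database path are hypothetical. The sketch relies on Derby's `shutdown=true` connection attribute, which reports a successful single-database shutdown by throwing an SQLException with SQLState 08006. Running `main` assumes the embedded Derby driver is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

final class DerbyReopen {
    // SQLState Derby uses to report a successful single-database shutdown.
    static final String DATABASE_SHUTDOWN = "08006";

    // Appends the shutdown attribute to a Derby connection URL.
    static String shutdownUrl(String dbUrl) {
        return dbUrl + ";shutdown=true";
    }

    // True if the exception is the expected "shutdown succeeded" signal.
    static boolean isCleanShutdown(SQLException e) {
        return DATABASE_SHUTDOWN.equals(e.getSQLState());
    }

    // Shuts the database down, then opens a fresh connection to it.
    static Connection reopen(String dbUrl) throws SQLException {
        try {
            // A shutdown request always ends in an SQLException on success.
            DriverManager.getConnection(shutdownUrl(dbUrl)).close();
        } catch (SQLException e) {
            if (!isCleanShutdown(e)) {
                throw e; // a real failure, not the shutdown signal
            }
        }
        return DriverManager.getConnection(dbUrl);
    }

    public static void main(String[] args) throws SQLException {
        // Hypothetical location; pass the real URL as the first argument.
        String url = args.length > 0 ? args[0] : "jdbc:derby:/mnt/nas/exampledb";
        try (Connection c = reopen(url)) {
            System.out.println("reopened, closed=" + c.isClosed());
        }
    }
}
```

Whether the shutdown request itself succeeds while the storage is still unreachable is not something this sketch can promise; any SQLState other than 08006 is simply rethrown to the caller.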

But if we're in server mode and the connection from Derby to the storage 
goes down, i.e.:

    Derby Client ------> Derby Server ---X--> Database on NAS

Is there some way to handle this situation gracefully?  It appears that 
the network server doesn't attempt a shutdown / connect when this occurs; 
it simply keeps the store open and in an unusable state.

Or is a better solution to make the store layer more aware that these 
problems can be temporary?

Daniel


-- 
Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
http://nuix.com/                                and eDiscovery software

RE: Client/server plus network storage of database files

Posted by Maris Orbidans <ma...@ingg.com>.
> Why is it connecting across a network?  Because locally attached disks
> aren't big enough.

Put your Derby server on the box where the disks are big enough.


Re: Client/server plus network storage of database files

Posted by Daniel Noll <da...@nuix.com>.
Knut Anders Hatlen wrote:
> Another possible issue with Derby and network file systems is that
> Derby uses a file locking protocol to prevent double-booting of a
> database. This protocol only works if the two instances that try to boot
> Derby run on the same host. If they run on different hosts and access
> the same database over a network fs, the database is likely to be
> corrupted.

I have tested this fairly recently, and CIFS (from Windows, to Samba -- 
not sure about other combinations) appears to work as one would expect 
with regard to file locks.

The diagram from my previous email still applies -- clients aren't 
connecting to the shared database, they're connecting to the network 
server and the network server is connecting across a network.  Why is it 
connecting across a network?  Because locally attached disks aren't big 
enough.

Daniel



Re: Client/server plus network storage of database files

Posted by Knut Anders Hatlen <Kn...@Sun.COM>.
Daniel Noll <da...@nuix.com> writes:

> Stanley Bradbury wrote:
>> The problem is that I/O across a network cannot be guaranteed.
>> You are actually losing data buffered along the network when the
>> network fails.  Derby detects this when it boots the next time and
>> finds the checksums and timestamps on the files in the database are
>> out of sync.
>> Derby database files must reside local to the machine hosting the
>> Derby DBMS engine.
>
> So what does Derby define as "across a network"?  FireWire is
> technically a network protocol after all.

Derby doesn't define anything as "across the network", it just relies on
the file system to be well-behaved. So basically as long as the file
system guarantees that sync calls don't return until the changes have
been written to persistent storage, Derby should work OK. Otherwise, one
may end up with inconsistencies between the transaction log and the data
files, and the database is left in an unbootable state.
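
As an illustration of the kind of sync call meant here, the Java sketch below writes a buffer and then calls `FileChannel.force(true)`, which asks the operating system to flush both data and metadata to the storage device before returning. This is not Derby's own I/O code; the class `DurableWrite` and its methods are hypothetical names. Whether `force` really reaches persistent storage on a given network file system depends on that file system, which is exactly the guarantee under discussion.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class DurableWrite {
    // Writes all bytes, then blocks until the OS reports them as synced.
    static void writeAndSync(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // true = also flush file metadata (size, timestamps).
            ch.force(true);
        }
    }

    // Demo helper: write to a temp file, sync, read back, clean up.
    static byte[] roundTrip(byte[] data) {
        try {
            Path tmp = Files.createTempFile("sync-demo", ".dat");
            try {
                writeAndSync(tmp, data);
                return Files.readAllBytes(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] back = roundTrip(new byte[] {1, 2, 3});
        System.out.println(back.length + " bytes written, synced and read back");
    }
}
```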

Another possible issue with Derby and network file systems is that
Derby uses a file locking protocol to prevent double-booting of a
database. This protocol only works if the two instances that try to boot
Derby run on the same host. If they run on different hosts and access
the same database over a network fs, the database is likely to be
corrupted.
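
To make the idea of a boot lock concrete, here is a small Java sketch using `FileChannel.tryLock()` on a lock file. It is not Derby's actual locking protocol, only an illustration of the general technique; `BootLock`, `tryBoot` and `secondBootRejected` are hypothetical names. Within one JVM a second attempt on the same file raises `OverlappingFileLockException`, and across processes on the same host `tryLock()` returns null. Across hosts the result depends on how the network file system implements locks.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class BootLock {
    // Tries to take an exclusive lock; returns null if someone else holds it.
    static FileLock tryBoot(FileChannel ch) throws IOException {
        try {
            return ch.tryLock(); // null when another process holds the lock
        } catch (OverlappingFileLockException e) {
            return null;         // this JVM already holds the lock
        }
    }

    // Demo: a second "boot" against the same lock file must be refused.
    static boolean secondBootRejected() {
        try {
            Path lockFile = Files.createTempFile("db", ".lck");
            try (FileChannel a = FileChannel.open(lockFile, StandardOpenOption.WRITE);
                 FileChannel b = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
                FileLock first = tryBoot(a);
                FileLock second = tryBoot(b);
                boolean rejected = first != null && second == null;
                if (second != null) {
                    second.release();
                }
                if (first != null) {
                    first.release();
                }
                return rejected;
            } finally {
                Files.deleteIfExists(lockFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second boot rejected: " + secondBootRejected());
    }
}
```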

-- 
Knut Anders

Re: Client/server plus network storage of database files

Posted by Daniel Noll <da...@nuix.com>.
Valentin Cozma wrote:
> Stanley Bradbury wrote:
>> Not being a hardware or network guy, I can't speak in detail to all
>> the TLAs you are using, but I do know the following is documented:
>> Derby database files must reside local to the machine hosting the
>> Derby DBMS engine.  So that leaves out anything over TCP/IP, and I know
>> that includes NFS mounts, Windows file shares and Samba mounts.  These
>> systems will result in corruption because there is no way to ensure
>> that a physical write to disk was completed.
> 
> From what I know, the problem is that NFS blocks and doesn't raise an
> exception until the SO TCP timeout is reached (Linux default: 2 hours).
>
> I don't know about Samba.

With CIFS/SMB you issue SMB_COM_FLUSH to flush, and it's documented as 
flushing to disk -- Samba can honor this (but the setting defaults to 
off).  Also, what you describe is NFS2 -- NFS3 can work asynchronously 
if you want, but it can still operate in NFS2 mode as well.

So really, CIFS and NFS both appear to be okay for syncing -- NFS is 
trickier to get working properly for locking, but that's another issue.

In any case, my database has not been corrupted -- the 
network went down, came back up, and Derby *thought* the database was 
corrupted, but on forcing a shutdown and opening it again it worked.

So my original question stands but I'll rephrase it -- in this 
situation, is there some way Derby can do this reconnection itself at 
the store level, or is there some way it can do it at the Network Server 
level, or do we have to shut down the entire Network Server and all its 
embedded databases to do it at our own level?
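
For the last option, doing it at the application's own level, a generic retry wrapper with a recovery step between attempts might look like the Java sketch below. Everything in it is hypothetical: `RetryWithRecovery`, its `call` method and the idea of passing a shutdown and re-open as the `recover` step are illustrative assumptions, not an existing Derby or Network Server facility.

```java
import java.util.concurrent.Callable;

final class RetryWithRecovery {
    // Runs the operation; after each failure except the last, runs the
    // recovery step (for example a shutdown and re-open) and tries again.
    static <T> T call(Callable<T> operation, Runnable recover, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    recover.run();
                }
            }
        }
        // Every attempt failed: surface the last cause to the caller.
        throw new IllegalStateException(
                "operation failed after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated operation: fails once, as if the storage had dropped out.
        String result = call(() -> {
            if (calls[0]++ == 0) {
                throw new java.sql.SQLException("store unavailable");
            }
            return "ok";
        }, () -> System.out.println("recovering..."), 3);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

A wrapper like this only helps if the recovery step can actually reset the store, which in the server case is the open question of this thread.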

Daniel



Re: Client/server plus network storage of database files

Posted by Valentin Cozma <va...@atreides.ro>.
Stanley Bradbury wrote:
> Not being a hardware or network guy, I can't speak in detail to all
> the TLAs you are using, but I do know the following is documented:
> Derby database files must reside local to the machine hosting the
> Derby DBMS engine.  So that leaves out anything over TCP/IP, and I know
> that includes NFS mounts, Windows file shares and Samba mounts.  These
> systems will result in corruption because there is no way to ensure
> that a physical write to disk was completed.
>

From what I know, the problem is that NFS blocks and doesn't raise an 
exception until the SO TCP timeout is reached (Linux default: 2 hours).

I don't know about Samba.

Re: Client/server plus network storage of database files

Posted by Stanley Bradbury <St...@gmail.com>.
Daniel Noll wrote:
> Stanley Bradbury wrote:
>> The problem is that I/O across a network cannot be guaranteed.
>> You are actually losing data buffered along the network when the
>> network fails.  Derby detects this when it boots the next time and
>> finds the checksums and timestamps on the files in the database are
>> out of sync.
>> Derby database files must reside local to the machine hosting the 
>> Derby DBMS engine.
>
> So what does Derby define as "across a network"?  FireWire is 
> technically a network protocol after all.
>
> Does this just mean I/O across TCP/IP?  Or just I/O across SMB?  Or 
> across NFS and SMB?  Or generally any network protocol which happens 
> to buffer?  But directly connected disks also buffer, so presumably it 
> isn't that last option.
>
> Daniel
>
>
Not being a hardware or network guy, I can't speak in detail to all the 
TLAs you are using, but I do know the following is documented:  Derby 
database files must reside local to the machine hosting the Derby DBMS 
engine.  So that leaves out anything over TCP/IP, and I know that includes 
NFS mounts, Windows file shares and Samba mounts.  These systems will 
result in corruption because there is no way to ensure that a physical 
write to disk was completed.


Re: Client/server plus network storage of database files

Posted by Daniel Noll <da...@nuix.com>.
Stanley Bradbury wrote:
> The problem is that I/O across a network cannot be guaranteed.  You 
> are actually losing data buffered along the network when the network 
> fails.  Derby detects this when it boots the next time and finds the 
> checksums and timestamps on the files in the database are out of sync.
> Derby database files must reside local to the machine hosting the Derby 
> DBMS engine.

So what does Derby define as "across a network"?  FireWire is 
technically a network protocol after all.

Does this just mean I/O across TCP/IP?  Or just I/O across SMB?  Or 
across NFS and SMB?  Or generally any network protocol which happens to 
buffer?  But directly connected disks also buffer, so presumably it 
isn't that last option.

Daniel



Re: Client/server plus network storage of database files

Posted by Stanley Bradbury <St...@gmail.com>.
Daniel Noll wrote:
> Hi all.
>
> I have a question about the case where an embedded Derby database is 
> being stored on NFS or CIFS or whatever.
>
> Basically what I'm seeing happen is, a network outage occurs at some 
> point in time.  This causes a fairly normal store error as you would 
> expect, but then Derby "remembers" that the database is corrupt and 
> refuses any further operations, even if the network comes back up.
>
> We can attempt to get around this when operating in pure embedded mode 
> by doing a full shutdown and re-open, and this does work.
>
> But if we're in server mode and the connection from Derby to the 
> storage goes down, i.e.:
>
>    Derby Client ------> Derby Server ---X--> Database on NAS
>
> Is there some way to handle this situation gracefully?  It appears 
> that the network server doesn't attempt a shutdown / connect when this 
> occurs; it simply keeps the store open and in an unusable state.
>
> Or is a better solution to make the store layer more aware that these 
> problems can be temporary?
>
> Daniel
>
>
The problem is that I/O across a network cannot be guaranteed.  You 
are actually losing data buffered along the network when the network 
fails.  Derby detects this when it boots the next time and finds that the 
checksums and timestamps on the files in the database are out of sync.

Derby database files must reside local to the machine hosting the Derby 
DBMS engine.