Posted to dev@jackrabbit.apache.org by "Thomas Mueller (JIRA)" <ji...@apache.org> on 2010/07/19 10:42:51 UTC

[jira] Commented: (JCR-2677) Extend the FileDataStore implementation to support read-only media (eg. WORMs)

    [ https://issues.apache.org/jira/browse/JCR-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889788#action_12889788 ] 

Thomas Mueller commented on JCR-2677:
-------------------------------------

That would be a nice feature.

With WORM media you might want to create the temp file in the temp directory, and not in the datastore directory. This would also allow distributing the data store (using soft links to other disks within the datastore directory). In this case, you also can't simply "rename" a temp file because the target directory is on a different drive than the source directory.
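
A minimal sketch of that fallback, assuming the java.nio.file API (Java 7 or later); the class and method names are hypothetical and not part of the FileDataStore: try an atomic rename first, and only copy the temp file when the target is on a different drive.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, for illustration only.
final class TempFileMover {

    /**
     * Moves a temp file to its final place in the data store.
     * Returns true if the move was an atomic rename, false if it
     * had to fall back to a (non-atomic) copy followed by a delete.
     */
    public static boolean moveToStore(Path tempFile, Path target) throws IOException {
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        try {
            Files.move(tempFile, target, StandardCopyOption.ATOMIC_MOVE);
            return true;
        } catch (AtomicMoveNotSupportedException e) {
            // Source and target are on different file stores:
            // copy the content, then remove the temp file.
            Files.copy(tempFile, target);
            Files.delete(tempFile);
            return false;
        }
    }
}
```

The return value tells the caller whether readers may have been able to observe a partially written file.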

But that means creating the file in the "final" place is no longer an atomic "rename" (you have to copy the file block by block). This is a problem for concurrent writes (currently the FileDataStore would throw an exception saying the size doesn't match). But it's also a problem for concurrent reads: How can you detect that the non-atomic write operation is still running? Are files locked while they are created, or will reading from the input stream automatically wait for the writer? If this is not the case, can we just re-try a few times (until the file doesn't change for 10 seconds for example)?
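
The re-try idea could look roughly like the following sketch, which polls the file's size and modification time until both have been unchanged for a quiet period. The class name, method name and parameters are hypothetical, and the approach is only a heuristic: a writer that has stalled looks the same as one that has finished.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
final class StableFileWaiter {

    /**
     * Waits until the file's size and last-modified time have not changed
     * for quietMillis. Returns false if that does not happen within
     * timeoutMillis.
     */
    public static boolean waitUntilStable(Path file, long quietMillis,
            long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        long start = System.nanoTime();
        long stableSince = start;
        long lastSize = Files.size(file);
        long lastModified = Files.getLastModifiedTime(file).toMillis();
        while (true) {
            long now = System.nanoTime();
            if ((now - stableSince) / 1_000_000 >= quietMillis) {
                return true;   // unchanged for the whole quiet period
            }
            if ((now - start) / 1_000_000 >= timeoutMillis) {
                return false;  // gave up waiting
            }
            Thread.sleep(pollMillis);
            long size = Files.size(file);
            long modified = Files.getLastModifiedTime(file).toMillis();
            if (size != lastSize || modified != lastModified) {
                // The writer is still active: restart the quiet period.
                lastSize = size;
                lastModified = modified;
                stableSince = System.nanoTime();
            }
        }
    }
}
```

With the numbers from above, a reader would call something like waitUntilStable(file, 10000, 60000, 500) before opening the stream; the 60 second timeout and 500 ms poll interval are arbitrary choices.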

For WORM media, what if an exception occurs while the file is being created (so the file is broken)?

For such cases, what about creating multiple versions of the same file if we detect the current one is broken, for example:

1d26ee96b6b5b886a3ac2b68df0636c97db5fbfd (broken)
1d26ee96b6b5b886a3ac2b68df0636c97db5fbfd-1 (correct)

Broken files would be very rare of course, but I guess we still need a way to solve this problem. The algorithm could support multiple broken files of course, and just use the suffix -2, -3,..., until it works (up to a limit of maybe 100).
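
A sketch of the lookup side of this, assuming the file name is the SHA-1 hex digest of the content, so that a broken file can be detected by re-computing the digest. The class and method names are hypothetical, and the sketch further assumes that versions are created in order without gaps.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only.
final class VersionedRecordFinder {

    static final int MAX_VERSIONS = 100; // limit suggested above

    /** Computes the lowercase SHA-1 hex digest of the file content. */
    public static String sha1Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /**
     * Tries id, id-1, id-2, ... and returns the first file whose content
     * matches the identifier, or null if there is none.
     */
    public static Path findIntact(Path dir, String id)
            throws IOException, NoSuchAlgorithmException {
        for (int i = 0; i <= MAX_VERSIONS; i++) {
            Path candidate = dir.resolve(i == 0 ? id : id + "-" + i);
            if (!Files.exists(candidate)) {
                return null; // end of the chain: no intact version exists
            }
            if (id.equals(sha1Hex(candidate))) {
                return candidate;
            }
            // Digest mismatch: this version is broken, try the next suffix.
        }
        return null;
    }
}
```

Re-hashing on every read would be expensive for large files, so a real implementation would probably verify only once (for example after writing) and remember the result.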



> Extend the FileDataStore implementation to support read-only media (eg. WORMs)
> ------------------------------------------------------------------------------
>
>                 Key: JCR-2677
>                 URL: https://issues.apache.org/jira/browse/JCR-2677
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>    Affects Versions: 2.2.0
>            Reporter: Ulrich Cech
>
> Currently, the FileDataStore does not support read-only media. In a professional environment where data consistency and immutability of data are important (like archiving systems), this functionality is very important.
> I would try to do the implementation and contribute it...
> I attached the conversation from the Jackrabbit users mailing list:
> -----Original Message-----
> From: Thomas Müller [mailto:thomas.mueller@day.com] 
> Sent: Wednesday, 14 July 2010 11:52
> To: users@jackrabbit.apache.org
> Subject: Re: Jackrabbit and WORM
> Hi,
> > written to read-only media
> Do you mean written to write-once media? The DataStore implementation
> does not support this feature currently, however you could probably
> change the FileDataStore to support it. Instead of writing the
> temporary file to the datastore directory, it would have to be written
> to a different place (the temp directory for example). If you don't
> have a temp directory then it's a bit more complicated (binaries would
> need to be split into smaller blocks that fit in memory).
> Regards,
> Thomas
> On Wed, Jul 14, 2010 at 11:42 AM, Cech. Ulrich <Ul...@aeb.de> wrote:
> > I have problems using Jackrabbit with a storage system where files can only be added, but not changed or deleted.
> > I found out that BinaryImpl.class creates a TransientFileFactory, where the stream is written to a temporary file that is deleted later. If this deletion fails, I get an exception
> > ...
> > Caused by: java.io.IOException: Can not rename c:\temp\cr20fs\repository\datastore\tmp21866.tmp to c:\temp\cr20fs\repository\datastore\8d\54\82\8d548201d39d7594d182c2a3901fa38dfeebc6b3 (media read only?)
> > ...
> >
> > I tried to set the DataStore parameter "minRecordLength" to a very high value, so that the stream is handled in memory, but this is limited by the available heap space and so is not practical.
> >
> > Does anyone have experience with Jackrabbit and read-only media? Can it be configured so that only the repository and the versions are written to read-only media, while other files (like the Lucene index, which can easily be configured to use some other directory, so that's no problem) are written to some "normal" storage system?
> >
> > Many thanks in advance,
> > Ulrich

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.