Posted to dev@jackrabbit.apache.org by Paco Avila <pa...@git.es> on 2006/03/17 08:50:30 UTC

about document history

Hi, I've seen that Jackrabbit saves document changes, so I can
retrieve a specific document version. How are these versions stored? As
deltas, or is the whole document stored each time?

In my application I need to know the space occupied by any document,
including its history. Is that possible?


Thanks in advance.

-- 
Paco Avila
GIT Consultors


Re: about document history

Posted by Paco Avila <pa...@git.es>.
On Wed, 22-03-2006 at 01:08 -0800, samiam wrote:
> hi toby
> 
> the use case is about storing a lot of binary documents (*.doc *.pdf *.tif
> ...) per "project". file sizes will range from a few KB up to a few MB. every
> change a user makes to such a document has to be stored as a version. (it
> must be possible at any time to reconstruct all changes to a document
> and to see who made them.)
> a document will exist for at least a few years and the number of versions can
> grow beyond 100.
> 
> so we are talking about TBs of data and millions of nodes with binary
> content. performance and disk usage are therefore two very "hot" topics i
> have to handle.

Me too.

-- 
Paco Avila
GIT Consultors


Re: about document history

Posted by samiam <sa...@ams-engineering.com>.
hi toby

the use case is about storing a lot of binary documents (*.doc *.pdf *.tif
...) per "project". file sizes will range from a few KB up to a few MB. every
change a user makes to such a document has to be stored as a version. (it
must be possible at any time to reconstruct all changes to a document
and to see who made them.)
a document will exist for at least a few years and the number of versions can
grow beyond 100.

so we are talking about TBs of data and millions of nodes with binary
content. performance and disk usage are therefore two very "hot" topics i
have to handle.
--
View this message in context: http://www.nabble.com/about-document-history-t1296233.html#a3528700
Sent from the Jackrabbit - Dev forum at Nabble.com.


Re: about document history

Posted by Tobias Bocanegra <to...@day.com>.
hi sam,
> do you know if any work is going on concerning storing deltas instead of
> total copies?
no, currently not. the difficulty is that the version storage
must be 'browsable' like normal content, i.e. the version nodes inside
the version storage must reflect the complete state of the node at
the time it was versioned. currently, this is simply done by using a
'normal' persistence manager and dynamically mapping the nodes into the
workspace.
so any delta storage can only happen in the persistence layer, or by
completely replacing the version manager.
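
To illustrate the constraint described above, here is a minimal sketch of
a store that keeps deltas internally but always hands back the complete
content of a version when it is read. It is not Jackrabbit code: the class
DeltaBackedStore and its methods are hypothetical, and the "delta" is a
simple common-prefix/common-suffix trim chosen for brevity, not a real
binary diff algorithm.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical store: keeps deltas internally, always returns full content. */
final class DeltaBackedStore {
    /** One stored version, expressed relative to the previous version. */
    private static final class Delta {
        final int prefix;    // leading bytes identical to the previous version
        final int suffix;    // trailing bytes identical to the previous version
        final byte[] middle; // bytes that sit between prefix and suffix

        Delta(int prefix, int suffix, byte[] middle) {
            this.prefix = prefix;
            this.suffix = suffix;
            this.middle = middle;
        }
    }

    private final List<Delta> deltas = new ArrayList<>();
    private byte[] latest = new byte[0];

    /** Adds a version, storing only what differs from the previous one. */
    int add(byte[] content) {
        int max = Math.min(latest.length, content.length);
        int prefix = 0;
        while (prefix < max && latest[prefix] == content[prefix]) {
            prefix++;
        }
        int suffix = 0;
        // Bounded so that prefix and suffix never overlap in either array.
        while (suffix < max - prefix
                && latest[latest.length - 1 - suffix] == content[content.length - 1 - suffix]) {
            suffix++;
        }
        byte[] middle = Arrays.copyOfRange(content, prefix, content.length - suffix);
        deltas.add(new Delta(prefix, suffix, middle));
        latest = content.clone();
        return deltas.size() - 1; // index of the new version
    }

    /** Rebuilds the complete content of a version by replaying deltas from the start. */
    byte[] read(int version) {
        byte[] current = new byte[0];
        for (int i = 0; i <= version; i++) {
            Delta d = deltas.get(i);
            byte[] next = new byte[d.prefix + d.middle.length + d.suffix];
            System.arraycopy(current, 0, next, 0, d.prefix);
            System.arraycopy(d.middle, 0, next, d.prefix, d.middle.length);
            System.arraycopy(current, current.length - d.suffix,
                    next, d.prefix + d.middle.length, d.suffix);
            current = next;
        }
        return current;
    }

    /** Bytes actually held by the store (the changed middles only). */
    long storedBytes() {
        long total = 0;
        for (Delta d : deltas) {
            total += d.middle.length;
        }
        return total;
    }

    public static void main(String[] args) {
        DeltaBackedStore store = new DeltaBackedStore();
        int v0 = store.add("hello world".getBytes(StandardCharsets.UTF_8));
        int v1 = store.add("hello brave world".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(store.read(v0), StandardCharsets.UTF_8));
        System.out.println(new String(store.read(v1), StandardCharsets.UTF_8));
        System.out.println(store.storedBytes() + " bytes stored");
    }
}
```

Callers only ever see complete content, which is why such a scheme would
have to sit below the layer that exposes version nodes. The trade-off in
this toy version is that reading version n replays n + 1 deltas.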

> will it be possible in the future to change a running environment from
> storing copies to storing deltas?`
if implemented, yes.

> will there be a solution for "changing" all versions stored as copies in the
> past to saved deltas?
yes, basically by doing an export/import.

i'm interested in the use case behind your questions, i.e. what is
your estimate of how many versions of documents (in total) you will
have, and how much space they will use? are there thousands of 1 MB
documents with hundreds of versions each (> 100 GB), or what is the scale?
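
The scale in question can be estimated with simple arithmetic. The sketch
below assumes every version is kept as a complete copy and ignores
per-node overhead. The class name, the method name and the sample figures
(1,000 documents, 100 versions each, 1 MB per version, decimal units) are
illustrative assumptions, not measurements.

```java
/** Back-of-the-envelope storage estimate for full-copy versioning. */
final class VersionStorageEstimate {
    /** Bytes needed when every version is kept as a complete copy. */
    static long totalBytes(long documents, long versionsPerDocument, long avgBytesPerVersion) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping silently.
        return Math.multiplyExact(
                Math.multiplyExact(documents, versionsPerDocument), avgBytesPerVersion);
    }

    public static void main(String[] args) {
        long megabyte = 1_000_000L;     // decimal megabyte
        long gigabyte = 1_000_000_000L; // decimal gigabyte
        long bytes = totalBytes(1_000, 100, megabyte);
        System.out.println(bytes / gigabyte + " GB"); // prints "100 GB"
    }
}
```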

regards, toby
--
-----------------------------------------< tobias.bocanegra@day.com >---
Tobias Bocanegra, Day Management AG, Barfuesserplatz 6, CH - 4001 Basel
T +41 61 226 98 98, F +41 61 226 98 97
-----------------------------------------------< http://www.day.com >---

Re: about document history

Posted by samiam <sa...@ams-engineering.com>.
hi toby

do you know if any work is going on concerning storing deltas instead of
total copies?
will it be possible in the future to change a running environment from
storing copies to storing deltas?`
will there be a solution for "changing" all versions stored as copies in the
past to saved deltas?

regards,
sam


Re: about document history

Posted by Tobias Bocanegra <to...@day.com>.
hi paco,
the versions are stored inside the repository, using a separate
persistence manager. virtually, they reside under
/jcr:system/jcr:versionStorage as specified by jsr170. whenever you
check in a node, all its content is copied to that location. there is
no delta stored (yet).
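
As a rough illustration of what full copies mean for disk usage, here is a
toy in-memory model in which the space taken by a document's history is
the sum of the sizes of all its checked-in copies. This is not the
Jackrabbit or JCR API: FullCopyHistory and its methods are made up for the
example, and real storage adds node and property overhead on top.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of full-copy versioning: every checkin keeps a complete copy. */
final class FullCopyHistory {
    private final List<byte[]> versions = new ArrayList<>();

    /** Stores a complete copy of the content, as a checkin without deltas would. */
    void checkin(byte[] content) {
        versions.add(content.clone());
    }

    /** Number of versions checked in so far. */
    int versionCount() {
        return versions.size();
    }

    /** Total bytes held by the history: the sum of every stored copy. */
    long historyBytes() {
        long total = 0;
        for (byte[] version : versions) {
            total += version.length;
        }
        return total;
    }

    public static void main(String[] args) {
        FullCopyHistory history = new FullCopyHistory();
        history.checkin(new byte[1000]);
        history.checkin(new byte[1200]);
        history.checkin(new byte[1100]);
        // prints "3 versions, 3300 bytes"
        System.out.println(history.versionCount() + " versions, "
                + history.historyBytes() + " bytes");
    }
}
```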

regards, toby

On 3/17/06, Paco Avila <pa...@git.es> wrote:
> Hi, I've seen that Jackrabbit saves document changes, so I can
> retrieve a specific document version. How are these versions stored? As
> deltas, or is the whole document stored each time?
>
> In my application I need to know the space occupied by any document,
> including its history. Is that possible?
>
>
> Thanks in advance.
>
> --
> Paco Avila
> GIT Consultors
>
>


--
-----------------------------------------< tobias.bocanegra@day.com >---
Tobias Bocanegra, Day Management AG, Barfuesserplatz 6, CH - 4001 Basel
T +41 61 226 98 98, F +41 61 226 98 97
-----------------------------------------------< http://www.day.com >---