Posted to dev@subversion.apache.org by "Vyacheslav V. Zholudev" <vy...@gmail.com> on 2008/06/01 10:02:10 UTC
Multiple entries for the same key in the strings table
Hey all!
Could somebody please explain why we can have multiple entries for
the same key in the 'strings' table in the BDB backend? What are the
reasons behind it?
Thanks!
Best,
Vyacheslav
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Re: Multiple entries for the same key in the strings table
Posted by "C. Michael Pilato" <cm...@collab.net>.
Vyacheslav V. Zholudev wrote:
> Michael, you said that you used to write data into one row in BDB. How
> did you do that? In particular, how did you accumulate the chunks that
> were read from a stream? Am I right that you had to cache everything
> somewhere and then, once the whole fulltext was assembled, flush
> it to BDB?
Nope. Subversion has never allowed itself to hold all of a file's contents
in memory -- that's just too risky scalability-wise. We used BDB's partial
value access logic to simply append the new bits to the existing
(contents-in-progress) value.
--
C. Michael Pilato <cm...@collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand
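To illustrate the reply above, here is a minimal Python sketch of what "append via partial value access" means. It is a toy model, not Berkeley DB and not Subversion's code: the names partial_put, append_chunk, and store_stream are hypothetical, and the parameters doff and dlen are only borrowed from the fields Berkeley DB's DBT structure uses for partial access. The bytearray stands in for the stored row; the point is that the writer only ever holds one chunk at a time.

```python
def partial_put(value: bytearray, doff: int, dlen: int, data: bytes) -> None:
    """Toy partial write: replace dlen bytes starting at offset doff with data.

    This models only the idea of a partial-value write. It does not
    reproduce Berkeley DB's behavior for offsets past the end of the record.
    """
    if doff > len(value):
        raise ValueError("this toy model only supports offsets within the value")
    value[doff:doff + dlen] = data


def append_chunk(value: bytearray, chunk: bytes) -> None:
    """Append by writing zero-length-replacing data at the current end."""
    partial_put(value, len(value), 0, chunk)


def store_stream(chunks) -> bytes:
    """Consume an iterable of chunks, appending each one as it arrives.

    'stored' stands in for the row held by the database. The caller never
    assembles the full contents before writing.
    """
    stored = bytearray()
    for chunk in chunks:
        append_chunk(stored, chunk)
    return bytes(stored)
```

For example, store_stream(iter([b"ab", b"cd", b"ef"])) consumes the three chunks one at a time and the stored value ends up as the concatenation of all three.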
Re: Multiple entries for the same key in the strings table
Posted by "Vyacheslav V. Zholudev" <vy...@gmail.com>.
Michael, you said that you used to write data into one row in BDB. How
did you do that? In particular, how did you accumulate the chunks that
were read from a stream? Am I right that you had to cache everything
somewhere and then, once the whole fulltext was assembled, flush
it to BDB?
C. Michael Pilato wrote:
> Vyacheslav V. Zholudev wrote:
>> Hey all!
>>
>> Could somebody please explain why we can have multiple entries
>> for the same key in the 'strings' table in the BDB backend? What
>> are the reasons behind it?
>
> We get data piecemeal, in chunks, through the interfaces that write
> file contents to the 'strings' table. We used to simply append this
> data to the one "row" for that string ID, but then we noticed that
> Berkeley DB's write-ahead logging subsystem wanted to replicate the
> whole string-so-far value over and over again in the log.* files (so
> the transaction could be replayed if necessary). This caused
> obnoxious bloat of the disk. We solved this issue (and possibly some
> performance-related pains, though I don't recall for sure)
> by simply splitting each of these incoming chunks into its own "row",
> but with the same key (the collection of which we traverse with the
> magic of cursors).
>
Re: Multiple entries for the same key in the strings table
Posted by Karl Fogel <kf...@red-bean.com>.
"C. Michael Pilato" <cm...@collab.net> writes:
> Vyacheslav V. Zholudev wrote:
>> Could somebody please explain why we can have multiple entries
>> for the same key in the 'strings' table in the BDB backend? What
>> are the reasons behind it?
>
> We get data piecemeal, in chunks, through the interfaces that write
> file contents to the 'strings' table. We used to simply append this
> data to the one "row" for that string ID, but then we noticed that
> Berkeley DB's write-ahead logging subsystem wanted to replicate the
> whole string-so-far value over and over again in the log.* files (so
> the transaction could be replayed if necessary). This caused
> obnoxious bloat of the disk. We solved this issue (and possibly some
> performance-related pains, though I don't recall for sure)
> by simply splitting each of these incoming chunks into its own "row",
> but with the same key (the collection of which we traverse with the
> magic of cursors).
And Vyacheslav, if you don't find a comment already in the code
explaining this, then you could use Mike's explanation above to produce
a patch that adds such a comment in a useful location...
-Karl
Re: Multiple entries for the same key in the strings table
Posted by "C. Michael Pilato" <cm...@collab.net>.
Vyacheslav V. Zholudev wrote:
> Hey all!
>
> Could somebody please explain why we can have multiple entries for
> the same key in the 'strings' table in the BDB backend? What are the
> reasons behind it?
We get data piecemeal, in chunks, through the interfaces that write file
contents to the 'strings' table. We used to simply append this data to the
one "row" for that string ID, but then we noticed that Berkeley DB's
write-ahead logging subsystem wanted to replicate the whole string-so-far
value over and over again in the log.* files (so the transaction could be
replayed if necessary). This caused obnoxious bloat of the disk. We solved
this issue (and possibly some performance-related pains, though
I don't recall for sure) by simply splitting each of these incoming chunks
into its own "row", but with the same key (the collection of which we
traverse with the magic of cursors).
--
C. Michael Pilato <cm...@collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand
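The trade-off described in the message above can be sketched with a small Python toy model. This is not Berkeley DB and not Subversion's implementation: ToyStringsTable, its methods, and the load helper are hypothetical names, and the logging rule is an assumption taken directly from the explanation in this thread (growing one row logs the whole value-so-far on every append, while adding a new row under the same key logs only the new chunk). The read method plays the role of the cursor that walks all rows sharing a key.

```python
from collections import defaultdict


class ToyStringsTable:
    """Toy stand-in for the 'strings' table; it only counts logged bytes."""

    def __init__(self):
        self.rows = defaultdict(list)  # key -> list of row values, in insertion order
        self.logged_bytes = 0          # modeled write-ahead log volume

    def append_to_single_row(self, key, chunk):
        # Old scheme: one growing row per key.
        # Model assumption: each append logs the whole value-so-far.
        rows = self.rows[key]
        if rows:
            rows[0] = rows[0] + chunk
        else:
            rows.append(chunk)
        self.logged_bytes += len(rows[0])

    def add_duplicate_row(self, key, chunk):
        # Newer scheme: every chunk becomes its own row under the same key.
        # Model assumption: only the new chunk is logged.
        self.rows[key].append(chunk)
        self.logged_bytes += len(chunk)

    def read(self, key):
        # Cursor-like traversal: yield every row stored under the key, in order.
        for row in self.rows.get(key, []):
            yield row


def load(method_name, chunks, key="string-1"):
    """Feed chunks into a fresh table with the named scheme; return table and contents."""
    table = ToyStringsTable()
    for chunk in chunks:
        getattr(table, method_name)(key, chunk)
    return table, b"".join(table.read(key))


chunks = [b"x" * 1000] * 4
old_table, old_contents = load("append_to_single_row", chunks)
new_table, new_contents = load("add_duplicate_row", chunks)
```

Under this model, both schemes reassemble the same 4000 bytes, but the single-row scheme logs 1000 + 2000 + 3000 + 4000 = 10000 bytes while the duplicate-row scheme logs 4000. The gap grows quadratically with the number of chunks, which is consistent with the log bloat described above.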