Posted to dev@subversion.apache.org by Julian Foad <ju...@wandisco.com> on 2010/02/19 09:34:25 UTC

[PATCH] Define PRISTINE.checksum as always being SHA-1

Bert and I were just discussing the checksums in PRISTINE. We think its
primary key should be SHA-1 always. A secondary index can be built on
the MD5 column if required.

Can we patch the docco like this?

[[[
* subversion/libsvn_wc/wc-metadata.sql
  (PRISTINE): Define "checksum" as always being SHA-1.
    Document the uniqueness properties of both checksum fields.
    Document when "md5_checksum" can be null.
]]]

[[[
Index: subversion/libsvn_wc/wc-metadata.sql
===================================================================
--- subversion/libsvn_wc/wc-metadata.sql	(revision 911752)
+++ subversion/libsvn_wc/wc-metadata.sql	(working copy)
@@ -172,7 +172,9 @@
    and ACTUAL_NODE tables.
  */
 CREATE TABLE PRISTINE (
-  /* ### the hash algorithm (MD5 or SHA-1) is encoded in this value */
+  /* The SHA-1 checksum of the pristine text. This is a unique key. The
+     SHA-1 checksum of a pristine text is assumed to be unique among all
+     pristine texts referenced from this database. */
   checksum  TEXT NOT NULL PRIMARY KEY,
 
   /* ### enumerated values specifying type of compression. NULL implies
@@ -189,7 +191,8 @@
   refcount  INTEGER NOT NULL,
 
   /* Alternative MD5 checksum used for communicating with older
-     repositories. */
+     repositories. Not guaranteed to be unique among table rows.
+     NULL if not (yet) calculated. */
   md5_checksum  TEXT
   );
]]]
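
For illustration only, here is a minimal sketch of the properties documented in the patch above, using Python's sqlite3 module. It is not code from Subversion. The table is cut down to the three columns visible in the diff, the index name I_PRISTINE_MD5 and both helper functions are hypothetical, and storing the checksums as bare hex digests is an assumption about the value format.

```python
import hashlib
import sqlite3

# Simplified stand-in for PRISTINE: only the columns visible in the patch.
SCHEMA = """
CREATE TABLE PRISTINE (
  checksum  TEXT NOT NULL PRIMARY KEY,  -- always SHA-1, unique
  refcount  INTEGER NOT NULL,
  md5_checksum  TEXT                    -- may be NULL, not unique
);
-- Hypothetical secondary (non-unique) index on the MD5 column.
CREATE INDEX I_PRISTINE_MD5 ON PRISTINE (md5_checksum);
"""


def add_pristine(db, text, with_md5=True):
    """Insert a row keyed by the SHA-1 of `text`; MD5 is optional."""
    sha1 = hashlib.sha1(text).hexdigest()
    md5 = hashlib.md5(text).hexdigest() if with_md5 else None
    db.execute(
        "INSERT INTO PRISTINE (checksum, refcount, md5_checksum) "
        "VALUES (?, 1, ?)",
        (sha1, md5),
    )
    return sha1


def find_by_md5(db, md5):
    """Return every SHA-1 key whose MD5 matches; there may be several."""
    rows = db.execute(
        "SELECT checksum FROM PRISTINE WHERE md5_checksum = ? "
        "ORDER BY checksum",
        (md5,),
    )
    return [row[0] for row in rows]


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)

sha1_a = add_pristine(db, b"hello\n")
# MD5 not (yet) calculated for this row, so the column stays NULL.
sha1_b = add_pristine(db, b"world\n", with_md5=False)

# A second row with the same SHA-1 violates the primary key.
try:
    add_pristine(db, b"hello\n")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Because the MD5 index is not declared unique, find_by_md5 returns a list rather than a single row, and callers would need to cope with zero, one or several matches.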

- Julian


Re: [PATCH] Define PRISTINE.checksum as always being SHA-1

Posted by Julian Foad <ju...@wandisco.com>.
Neels J Hofmeyr wrote:
> Julian Foad wrote:
> > Bert and I were just discussing the checksums in PRISTINE. We think its
> > primary key should be SHA-1 always. A secondary index can be built on
> > the MD5 column if required.
> > 
> > Can we patch the docco like this?
> 
> +1

OK, committed in revision 911804.

- Julian


> There was agreement on always using SHA-1 hashes in the 'pristine store
> design' thread. See where Greg says "nono" in
> http://mail-archives.apache.org/mod_mbox/subversion-dev/201002.mbox/%3C6cca3db31002171652u15ae7e87ha5b09e4cb170af2a@mail.gmail.com%3E
> 
> The possibility of indexing on MD5 was also raised in that thread, but it
> was not well received.
> 
> ~Neels


Re: [PATCH] Define PRISTINE.checksum as always being SHA-1

Posted by Neels J Hofmeyr <ne...@elego.de>.
Julian Foad wrote:
> Bert and I were just discussing the checksums in PRISTINE. We think its
> primary key should be SHA-1 always. A secondary index can be built on
> the MD5 column if required.
> 
> Can we patch the docco like this?

+1

There was agreement on always using SHA-1 hashes in the 'pristine store
design' thread. See where Greg says "nono" in
http://mail-archives.apache.org/mod_mbox/subversion-dev/201002.mbox/%3C6cca3db31002171652u15ae7e87ha5b09e4cb170af2a@mail.gmail.com%3E

The possibility of indexing on MD5 was also raised in that thread, but it
was not well received.

~Neels
