Posted to user@couchdb.apache.org by Antony Blakey <an...@gmail.com> on 2008/12/21 05:10:57 UTC
Re: couchdb (_external consistency issues and proposals)
(Posted to -dev because it has some development issues)
This is wrong BTW:
> elsif doc["Type"] == "user"
> doc["Roles"] && doc["Roles"].each do |r|
> db.execute("replace into links values (?, ?, ?)",
> db_name, doc_id, r);
> end
because it doesn't handle modifications correctly. In my production
code I do this:
db.execute("delete from links where db = ? and src = ?", db_name, doc_id);
doc["Roles"] && doc["Roles"].each do |r|
  db.execute("insert into links values (?, ?, ?)", db_name, doc_id, r);
end
i.e. always delete and recreate the derived document. You can do
incremental updates by reading from your indexes before updating. You
cannot reliably get the previous rev (for differencing) because it may
not exist.
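A minimal sketch of the incremental alternative, assuming the roles currently indexed for the document have already been read back from the links table. The helper name role_link_changes is hypothetical and not part of my production code; it only works out which rows to delete and which to insert:

```ruby
require "set"

# Hypothetical helper: compare the roles already indexed for a document
# with the roles in the updated document, and report the difference.
# Either argument may be nil (no rows indexed yet, or no "Roles" field).
def role_link_changes(indexed_roles, doc_roles)
  indexed = Set.new(indexed_roles || [])
  current = Set.new(doc_roles || [])
  {
    delete: (indexed - current).to_a.sort, # rows to remove from links
    insert: (current - indexed).to_a.sort  # rows to add to links
  }
end

changes = role_link_changes(["admin", "editor"], ["editor", "reviewer"])
# changes[:delete] and changes[:insert] would then drive the SQL statements.
```

The point is that the comparison is against the index, not against a previous document revision.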
My code also doesn't handle a database being deleted and then re-
created - the _external will think it has valid records, but they
belong to a previous database. You could do that through
notifications, but once again I think it needs to be synchronous if
you want to reason about it. A likely-to-work-most-of-the-time
solution would be to detect update_seq < stored_update_seq. A better
solution would be for each db to have a UUID, so that you don't have
to rely on the name as the identity.
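A rough sketch of the update_seq comparison, assuming the _external stores the last update_seq it processed for each database name. The method name and both parameters are hypothetical:

```ruby
# Hypothetical check. stored_update_seq is the last update_seq this
# _external recorded for the database name (nil if nothing is indexed);
# current_update_seq is the value reported for the database right now.
def db_probably_recreated?(stored_update_seq, current_update_seq)
  return false if stored_update_seq.nil? # nothing indexed, nothing stale
  current_update_seq < stored_update_seq
end

db_probably_recreated?(120, 3) # a lower seq suggests a delete and re-create
```

This only works most of the time: if the re-created database has already received more updates than the stored sequence, the check returns false and the stale records go unnoticed.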
Also, if your _external doesn't get triggered for a long time, and
while it's 'dormant' a document is deleted and the db is compacted,
you could miss deletions. One solution to that is that every _external
needs to be notified (synchronously) before a compaction so that it
can update to the update_seq of the MVCC snapshot that the compaction
will operate against. IMO a better solution is to have two UUIDs for
the database - one is per database, and one is 'per compaction'. Thus
an external will know if it needs to revalidate all the documents it
has indexed to check for missed deletion updates. You could just have
a per-compaction UUID, which would change if a db was deleted and then
created, thus triggering the same codepath, but this is a lot more
expensive than knowing that the entire db
Finally, note that this external operates for *every* database,
whereas you may want to enable and configure it using a design
document. Thus your external should always monitor updated design
documents and check for enablement. You can record the configuration
in the database (and cache it in the _external) and just ignore all
other changes. Personally I don't bother because the lazy-creation
means that no work is done unless I do an _external query, so
databases which don't get queried, don't incur a cost, and I have no
configuration data.
That's another reason to prefer a passive UUID-based identity scheme
for db-create/delete and compaction detection rather than a
notification system.
It would be good if each DB had two UUIDs, one per-db and one
per-compaction (i.e. changed in the MVCC snapshot during a compaction),
and if both were provided to every _external request.
Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787
If at first you don’t succeed, try, try again. Then quit. No use being
a damn fool about it
-- W.C. Fields
Re: couchdb (_external consistency issues and proposals)
Posted by Antony Blakey <an...@gmail.com>.
On 22/12/2008, at 5:14 PM, Antony Blakey wrote:
> I now know that this is wrong, sorry. Document deletions are never
> 'lost', and hence there's no need to track compaction generations.
> That raises a very different issue I noted in 'History of deletion,
> and the interaction with compactions' on couchdb-dev, but it's
> nothing to do with _external.
Hmmm. Further digging reveals that the purge function will in fact
remove the record of deletions. Luckily there's a purge_seq value
supplied in the dbinfo result (and also in each _external call). By
tracking this, an _external knows when to revalidate its documents.
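A minimal sketch of that tracking, assuming the caller has already pulled purge_seq out of the request or dbinfo. The PurgeTracker class and its in-memory hash are hypothetical; a real _external would persist the value alongside its index:

```ruby
# Hypothetical tracker: remembers the last purge_seq seen per database
# and reports when it has changed since the previous observation.
class PurgeTracker
  def initialize
    @last_purge_seq = {}
  end

  # Records purge_seq for db_name. Returns true if a different value was
  # recorded earlier, meaning a purge happened and the indexed documents
  # should be revalidated. The first observation returns false.
  def observe(db_name, purge_seq)
    previous = @last_purge_seq[db_name]
    @last_purge_seq[db_name] = purge_seq
    !previous.nil? && previous != purge_seq
  end
end

tracker = PurgeTracker.new
tracker.observe("users", 0) # first sighting, nothing to compare against
```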
Purging can break replication, especially in distributed systems
without centralized knowledge or control of replication status (which
is my situation).
Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787
Lack of will power has caused more failure than lack of intelligence
or ability.
-- Flower A. Newhouse
Re: couchdb (_external consistency issues and proposals)
Posted by Antony Blakey <an...@gmail.com>.
On 21/12/2008, at 2:40 PM, Antony Blakey wrote:
> Also, if your _external doesn't get triggered for a long time, and
> while it's 'dormant' a document is deleted and the db is compacted,
> you could miss deletions. One solution to that is that every
> _external needs to be notified (synchronously) before a compaction
> so that it can update to the update_seq of the MVCC snapshot that
> the compaction will operate against. IMO a better solution is to
> have two UUID's for the database - one is per database, and one is
> 'per compaction'. Thus an external will know if it needs to
> revalidate all the documents it has indexed to check for missed
> deletions updates. You could just have a per-compaction UUID, which
> would change if a db was deleted and then created, this triggering
> the same codepath, but this is a lot more expensive than knowing
> that the entire db
I now know that this is wrong, sorry. Document deletions are never
'lost', and hence there's no need to track compaction generations.
That raises a very different issue I noted in 'History of deletion,
and the interaction with compactions' on couchdb-dev, but it's nothing
to do with _external.
Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787
Reflecting on W.H. Auden's contemplation of 'necessary murders' in the
Spanish Civil War, George Orwell wrote that such amorality was only
really possible, 'if you are the kind of person who is always
somewhere else when the trigger is pulled'.
-- John Birmingham, "Appeasing Jakarta"