Posted to user@couchdb.apache.org by Antony Blakey <an...@gmail.com> on 2009/03/02 22:41:32 UTC

Re: Why sequential document ids? [was: Re: What's the speed(performance) of couchdb?]

On 03/03/2009, at 5:20 AM, Chris Anderson wrote:

> On Sat, Feb 28, 2009 at 5:42 AM, Antony Blakey <antony.blakey@gmail.com> wrote:
>>
>> What security issues are you thinking of?
>
> The one where you can tell by looking at a doc, which node it was
> inserted on. I suppose that's not strictly security - more like
> anonymity.

If you want a feature such as replication tracking, i.e. who have I
replicated from and how up-to-date am I, the ability to track the
replication source would be a good thing. It would allow a
computed version-vector form of staleness-tracking.

More would be required than just having a source identifier, because  
you would need to map that identifier to a domain useful to the user,  
but it would be a start.

I wonder if anonymity, even transitive anonymity, is that useful?

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

It is no measure of health to be well adjusted to a profoundly sick  
society.
   -- Jiddu Krishnamurti



Re: Why sequential document ids? [was: Re: What's the speed(performance) of couchdb?]

Posted by Antony Blakey <an...@gmail.com>.
On 03/03/2009, at 9:49 AM, Chris Anderson wrote:

> It's an information leak if you can tell the originating node because
> of the uuid. If we want an option to flag docs with information about
> the originating node, we should add that. Also, the doc.id isn't the
> right place for this, as what you want to know is the origination of
> updates, not documents.

Yes, sorry, I was confused - I thought we were talking about  
update_seqs.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

One should respect public opinion insofar as is necessary to avoid  
starvation and keep out of prison, but anything that goes beyond this  
is voluntary submission to an unnecessary tyranny.
   -- Bertrand Russell



Re: Why sequential document ids? [was: Re: What's the speed(performance) of couchdb?]

Posted by Chris Anderson <jc...@apache.org>.
On Mon, Mar 2, 2009 at 3:13 PM, Antony Blakey <an...@gmail.com> wrote:
> I guess it's a tradeoff between wanting a distributed overview and wanting
> global anonymity. Maybe that's a valid configuration option, rather than
> being policy.

It's an information leak if you can tell the originating node because
of the uuid. If we want an option to flag docs with information about
the originating node, we should add that. Also, the doc.id isn't the
right place for this, as what you want to know is the origination of
updates, not documents.

UUIDs that are consistent per-node are bad not just because you could
trace the node. There is even information leakage if you can tell that
a set of documents originated on the same node.

"Oh, the person who's participating in opposition-party discussions is
the person who filed these taxes."
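
To make the linkage concrete, here is a minimal sketch. It assumes ids
shaped like RFC 4122 version 1 UUIDs, which carry a 48-bit node field, as
a stand-in for "UUIDs that are consistent per-node". The thread does not
say how CouchDB generates ids, so the node values and the group_by_node
helper are hypothetical.

```python
import uuid
from collections import defaultdict


def group_by_node(doc_ids):
    """Group document ids by the 48-bit node field embedded in each UUID."""
    groups = defaultdict(list)
    for doc_id in doc_ids:
        groups[uuid.UUID(doc_id).node].append(doc_id)
    return dict(groups)


# Hypothetical node identifiers, made up for this illustration.
NODE_A = 0x0000DEADBEEF
NODE_B = 0x0000CAFEF00D

# Version 1 UUIDs embed the node value, so every id minted on a node shares it.
ids_a = [str(uuid.uuid1(node=NODE_A)) for _ in range(3)]
ids_b = [str(uuid.uuid1(node=NODE_B)) for _ in range(2)]
linked = group_by_node(ids_a + ids_b)

# Version 4 UUIDs are random, so the same field carries no shared value.
random_ids = [str(uuid.uuid4()) for _ in range(5)]
unlinked = group_by_node(random_ids)

print(len(linked), "groups from per-node ids")
print(len(unlinked), "groups from random ids")
```

An observer needs no mapping from node value to a person or URL to do
this grouping. Seeing that two documents share the field is enough to tie
them together.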

-- 
Chris Anderson
http://jchris.mfdz.com

Re: Why sequential document ids? [was: Re: What's the speed(performance) of couchdb?]

Posted by Antony Blakey <an...@gmail.com>.
On 03/03/2009, at 9:26 AM, Dean Landolt wrote:

> I can see the point of replication source, but document origination
> source is different than replication source, isn't it?

You are right. However, the document origination source is required for
a useful version vector, because you can then track which peer has how
much of a source's writes, which is useful if the source disappears.
You can then build a distributed picture of the eventual consistency
status.
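
A minimal sketch of that idea, assuming each write carries a hypothetical
(source node id, per-source counter) pair and that a peer holds a source's
writes contiguously up to the counter it reports. None of these names are
CouchDB APIs; they are made up for illustration.

```python
from typing import Dict, Tuple

# A version vector maps a source node id to the highest write counter
# from that source that a peer has received.
VersionVector = Dict[str, int]


def record_write(vv: VersionVector, source: str, counter: int) -> None:
    """Note that a write numbered `counter` from `source` has been received."""
    vv[source] = max(vv.get(source, 0), counter)


def staleness(local: VersionVector, remote: VersionVector) -> Dict[str, int]:
    """Per source, how many writes `remote` has seen that `local` has not."""
    behind = {}
    for source, counter in remote.items():
        gap = counter - local.get(source, 0)
        if gap > 0:
            behind[source] = gap
    return behind


def freshest_peer(peers: Dict[str, VersionVector], source: str) -> Tuple[str, int]:
    """Find the peer holding the most writes from `source`.

    This is the question to ask once the source itself has disappeared.
    """
    name, vv = max(peers.items(), key=lambda item: item[1].get(source, 0))
    return name, vv.get(source, 0)


peers: Dict[str, VersionVector] = {"peer_a": {}, "peer_b": {}}
record_write(peers["peer_a"], "node_x", 5)
record_write(peers["peer_a"], "node_y", 2)
record_write(peers["peer_b"], "node_x", 3)
record_write(peers["peer_b"], "node_y", 7)

print(staleness(peers["peer_a"], peers["peer_b"]))  # what peer_a is missing
print(freshest_peer(peers, "node_x"))               # who has most of node_x
```

Comparing the vectors pairwise across peers gives the distributed picture
of how far each one is from being up to date.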

> Yes.

I guess it's a tradeoff between wanting a distributed overview and  
wanting global anonymity. Maybe that's a valid configuration option,  
rather than being policy.

Unless a node makes its identity available in some way other than
replication, the source is still anonymous because there's no mapping  
of node id to source URL (which might change in any case). To identify  
the node you'd have to find the node, try to replicate from it, and  
get lucky that the id it provides is the one you're looking for.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Success is not the key to happiness. Happiness is the key to success.
  -- Albert Schweitzer


Re: Why sequential document ids? [was: Re: What's the speed(performance) of couchdb?]

Posted by Dean Landolt <de...@deanlandolt.com>.
On Mon, Mar 2, 2009 at 4:41 PM, Antony Blakey <an...@gmail.com> wrote:

>
> On 03/03/2009, at 5:20 AM, Chris Anderson wrote:
>
>  On Sat, Feb 28, 2009 at 5:42 AM, Antony Blakey <an...@gmail.com>
>> wrote:
>>
>>>
>>> What security issues are you thinking of?
>>>
>>
>> The one where you can tell by looking at a doc, which node it was
>> inserted on. I suppose that's not strictly security - more like
>> anonymity.
>>
>
> If you want a feature such as replication tracking, i.e. who have I
> replicated from and how up-to-date am I, the ability to track the
> replication source would be a good thing. It would allow a computed
> version-vector form of staleness-tracking.


I can see the point of replication source, but document origination source
is different than replication source, isn't it?

>
>
> More would be required than just having a source identifier, because you
> would need to map that identifier to a domain useful to the user, but it
> would be a start.
>
> I wonder if anonymity, even transitive anonymity, is that useful?


Yes. And not just for copyright infringement apps.