Posted to dev@couchdb.apache.org by Jan Lehnardt <ja...@apache.org> on 2022/09/05 12:25:24 UTC

Monthly Developer Meeting September

Hey all,

this Wednesday, Sept 7th, is the next instalment of our new monthly developer meeting.

Picking up from the agenda of the last meeting, our topic for this month is the automatic removal of document tombstones.

I started writing up a document that could become the RFC for this meeting. The aim of this document for now is to capture all the things we need to worry about when introducing the feature, which I think it does at the moment, but some of the sections need figuring out and expanding.

https://docs.google.com/document/d/1ZkJs3Lrk8YOmPHeVYSe_yTp5rkqqAQyAnSQ-VqZ0R-0/edit

In particular, the bits about transitively determining the removal-floor from internal replication history docs currently elude me, and I hope we can sort this out during the meeting or after. See my notes below.

* * *

Meeting info:

Topic: CouchDB Developer Monthly

Time: Sep 7, 2022 06:00 PM Amsterdam, Berlin, Rome, Stockholm, Vienna

Please download and import the following iCalendar (.ics) files to your calendar system.
Monthly: https://us02web.zoom.us/meeting/tZIkf-mupzkqGtEDuJGLC0qDbd__vD0NBpNy/ics?icsToken=98tyKuGupzMqHN2XsBmCRpwAHYqgc-7xmHpfjbd4uArPJAxwMhLSNOZ9DaRbQPby

Join Zoom Meeting
https://us02web.zoom.us/j/86980692005?pwd=cXNGbXNFR3l1NERMazVKZnlPU1NHdz09

Meeting ID: 869 8069 2005
Passcode: 844221
One tap mobile
+16469313860,,86980692005#,,,,*844221# US
+16694449171,,86980692005#,,,,*844221# US

Dial by your location
        +1 646 931 3860 US
        +1 669 444 9171 US
        +1 669 900 6833 US (San Jose)
        +1 719 359 4580 US
        +1 929 436 2866 US (New York)
        +1 253 215 8782 US (Tacoma)
        +1 301 715 8592 US (Washington DC)
        +1 309 205 3325 US
        +1 312 626 6799 US (Chicago)
        +1 346 248 7799 US (Houston)
        +1 386 347 5053 US
        +1 564 217 2000 US
Meeting ID: 869 8069 2005
Passcode: 844221
Find your local number: https://us02web.zoom.us/u/kWhDCM0Z

* * *

I had posted these notes of mine in our dev chat; maybe someone here can help me over the missing bit:

> hm I think I keep being confused. internal replication checkpoints keep a running upper count of the source seq for a replication and a history in which each entry marks a target_seq (https://github.com/apache/couchdb/blob/a1fc8075f3e86ec2242eedd2b1bbbd15758515e7/src/mem3/src/mem3_rpc.erl#L128-L146). in either case, push or pull replication, the top-level seq in the checkpoint gets updated on each checkpoint history expansion, and the history entry just shows the progress of the target, which doesn’t help with finding out which replacement seq is a safe lower bound to make sure I get all the updates.

> that is, I don’t know how the =<TgtSeq here helps me any, if what I want to compare is the seq from the original shard against checkpoint entries on the replacement shard that only keeps the latest seq for the source

> it feels like there is an inversion that my brain doesn’t want to follow

> hm, it looks like the link I shared isn’t where checkpoints are added. this code here (https://github.com/apache/couchdb/blob/a1fc8075f3e86ec2242eedd2b1bbbd15758515e7/src/mem3/src/mem3_rpc.erl#L267-L292) tracks a source_seq that would gel with how I understand this, but then the =< TgtSeq still eludes me (or we have a bug, which I don’t think we do), or that’s not the place where the replacement seq is calculated.

> the way my brain wants this: given a seq from the original shard, go through the rep history on the replacement shard and find the history entry where the “I read up to here from the source” seq is <= the given seq. for that entry, take the “when did this replication start” seq and use it as the new since for the changes call to the replacement shard.

> and I feel CouchDB has it implemented backwards, but I can’t see where/how

> reading this just adds to the confusion: https://github.com/apache/couchdb/blob/main/src/mem3/src/mem3_rep.erl#L196-L207
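
As a rough illustration of the lookup described in the "the way my brain wants this" note above, here is a minimal Python sketch. It is not the mem3 implementation. The names HistoryEntry, source_seq, start_seq and safe_since are hypothetical, sequences are assumed to be plain integers, and two choices are assumptions of the sketch rather than something stated in the notes: picking the qualifying entry with the largest source seq, and falling back to 0 when no entry qualifies.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class HistoryEntry:
    # Hypothetical field names, not the actual mem3 checkpoint schema.
    source_seq: int  # "I read up to here from the source"
    start_seq: int   # replacement-shard seq when this replication started


def safe_since(history: Iterable[HistoryEntry], original_seq: int) -> int:
    """Return a conservative `since` for the replacement shard's changes feed."""
    # Keep only entries whose source progress is at or below the given seq.
    candidates = [e for e in history if e.source_seq <= original_seq]
    if not candidates:
        # Assumption: with no usable entry, replay the feed from the start.
        return 0
    # Assumption: prefer the entry covering the most of the source,
    # which keeps the replayed range as short as possible.
    best = max(candidates, key=lambda e: e.source_seq)
    return best.start_seq


history = [
    HistoryEntry(source_seq=10, start_seq=0),
    HistoryEntry(source_seq=25, start_seq=12),
    HistoryEntry(source_seq=40, start_seq=30),
]
print(safe_since(history, 30))  # 12
print(safe_since(history, 5))   # 0
```

With a given seq of 30, the entries at source seq 10 and 25 qualify, so the sketch returns the start seq of the latter. Whether the real checkpoint history carries both values per entry is exactly the open question in the notes.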

Best
Jan
—

Re: Monthly Developer Meeting September

Posted by Jan Lehnardt <ja...@apache.org>.
Thanks all for coming, here are the notes, link to the recording included:

https://docs.google.com/document/d/1-P5oSF79AbhOtraSN2hlOnMAupPKi3rJGKK0ao1cfZA/edit#heading=h.a6knhzfaespk

The notes are a bit sparse compared to last time, but they capture the rough topics; we went into more detail in the meeting, and I only recorded the outcomes. There will be a more detailed writeup about the TTL feature in the next few days, so watch this space.

Best
Jan

— 
Professional Support for Apache CouchDB:
https://neighbourhood.ie/couchdb-support/

24/7 Observation for your CouchDB Instances:
https://opservatory.app
