Posted to dev@couchdb.apache.org by Ilya Khlopotov <ii...@apache.org> on 2020/05/01 10:23:33 UTC

Re: [DISCUSS] Streaming API in CouchDB 4.0

> Maybe we should set a hard limit on the maximum doc ids size, 2-4KB?
> We have a config setting to do it already.  
I am +100 for a stricter limit. We need to limit both db names and doc ids. However, I do agree that it is a bigger change and requires more discussion.

> Also was curious, in the latest proposal what will be included in the
> bookmark? Proposal said "The bookmark would include information needed
> to ensure proper pagination without the need to repeat initial
> parameters of the request".
The main idea of the proposal is to return a bookmark which can be used to request the next page without specifying any additional parameters. Moreover, we would forbid setting any extra parameters when a bookmark is present, which means the bookmark has to include all parameters passed to the initial request. In the PoC I am working on I took the approach of encoding all non-default fields of the mrargs record. However, this is an implementation detail.

There is an interesting aspect of this approach which is worth mentioning: when the user passes the keys parameter, we would include it in the bookmark. On closer look, this is no worse than what we have now. Currently, if a user wants to retrieve lots of documents with specified keys, they have to use a POST request and pass the keys in every request. With bookmark support they would continue to use POST, but they wouldn't need to provide the list of keys in each request because it would be encoded in the bookmark. We could even do a small optimization and remove already-returned keys when we update the bookmark to point to the next page. Also, the bookmark is a compressed term.
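
To make this concrete, here is a minimal sketch of what the encoding could look like, assuming an Erlang-term-based approach; non_default_fields/1 is a hypothetical helper and this is not the actual PoC code:
```
%% Sketch only: encode the non-default #mrargs{} fields as an opaque,
%% URL-safe, compressed bookmark.
bookmark_encode(#mrargs{} = Args) ->
    NonDefault = non_default_fields(Args),  %% e.g. [{start_key, K}, {keys, Ks}]
    %% [compressed] gives the "compressed term" mentioned above; the keys
    %% already returned could be trimmed from NonDefault on each page
    Bin = term_to_binary(NonDefault, [compressed]),
    couch_util:encodeBase64Url(Bin).

bookmark_decode(Bookmark) ->
    Bin = couch_util:decodeBase64Url(Bookmark),
    %% [safe] refuses to create new atoms from client-supplied input
    binary_to_term(Bin, [safe]).
```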

Best regards,
iilyak

On 2020/04/30 17:35:18, Nick Vatamaniuc <va...@gmail.com> wrote: 
> Hi Ilya,
> 
> Maybe we should set a hard limit on the maximum doc ids size, 2-4KB?
> We have a config setting to do it already.  And we also have a hard
> limit of 10KB for FDB keys. Due to a limitation in the Erlang HTTP header
> parser, used through mochiweb, 8KB is (was?) the limit, based on the
> default socket receive buffer size (see [1] for the gory details).
> 
> I was going to suggest 4KB initially, but say users want to
> specify a start_key and end_key + the dbname, so maybe 3KB or 2KB is a
> better option? It will be an incompatible change, so that's something
> to balance against.
> 
> Regarding setting server_name in the config: you're referring to the
> specific 8KB limitation from the mochiweb header parsing, but that
> shouldn't include the domain name, just individual request + header
> lines. It would look like "GET /dbname/_all_docs?bookmark...".
> 
> Also was curious, in the latest proposal what will be included in the
> bookmark? Proposal said "The bookmark would include information needed
> to ensure proper pagination without the need to repeat initial
> parameters of the request".
> 
> [1] https://github.com/apache/couchdb/commit/d23025ebd7176f6c307ddf49902cf20b33bd55c4
> 
> 
> On Thu, Apr 30, 2020 at 10:23 AM Ilya Khlopotov <ii...@apache.org> wrote:
> >
> > There is a problem with representing `next`/`previous`/`first` as a path. With 5kB-sized doc keys we could exceed the max URL length (8192 bytes). This means we would have to support POST. The question is how to handle the case when the URL is longer than 8192 bytes. The problem is that CouchDB doesn't know its own DNS name, so we don't know the safe value to compare against.
> >
> > Options are:
> > 1) always use POST for pagination
> > 2) add server_name to the config and return an error when the bookmark length exceeds a dynamically calculated threshold. The threshold would account for the db name length, server_name length, port and scheme length (a rough sketch follows the list).
> >   - what error to return?
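> >
> > A rough sketch of the threshold computation option 2 implies, assuming a fixed 8192-byte URL budget (all names here are hypothetical):
> > ```
> > %% bytes left for the bookmark once the rest of the URL is accounted for
> > max_bookmark_size(Scheme, ServerName, Port, DbName) ->
> >     Prefix = [Scheme, "://", ServerName, ":", integer_to_list(Port),
> >               "/", DbName, "/_all_docs?bookmark="],
> >     8192 - iolist_size(Prefix).
> > ```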
> >
> > I think option number 2 is too subtle to implement.
> >
> > The downside of option 1 is that it is a bit harder to use from the browser or curl.
> >
> > On 2020/04/29 17:27:59, Ilya Khlopotov <ii...@apache.org> wrote:
> > > I think I addressed all comments and created an RFC https://github.com/apache/couchdb-documentation/pull/530
> > >
> > > On 2020/04/28 11:56:15, Ilya Khlopotov <ii...@apache.org> wrote:
> > > > Hello,
> > > >
> > > > I would like to introduce a second proposal.
> > > >
> > > > 1) Add a new optional query field called `bookmark` (or `token`) to the following endpoints:
> > > >   - {db}/_all_docs
> > > >   - {db}/_all_docs/queries
> > > >   - _dbs_info
> > > >   - {db}/_design/{ddoc}/_view/{view}
> > > >   - {db}/_design/{ddoc}/_view/{view}/queries
> > > > 2) Add the following additional fields to the response:
> > > >    ```
> > > >     "first": {
> > > >         "href": "https://myserver.com/myddb/_all_docs?limit=50&descending=true"
> > > >     },
> > > >     "previous": {
> > > >          "href": "https://myserver.com/myddb/_all_docs?bookmark=983uiwfjkdsdf"
> > > >     },
> > > >     "next": {
> > > >         "href": "https://myserver.com/myddb/_all_docs?bookmark=12343tyekf3"
> > > >      },
> > > >      ```
> > > > 3) Implement per-endpoint configurable max limits
> > > >    ```
> > > >    [request_limits]
> > > >   _all_docs = 5000
> > > >   _all_docs/queries = 5000
> > > >   _all_dbs = 5000
> > > >   _dbs_info = 5000
> > > >   _view = 2500
> > > >   _view/queries = 2500
> > > >   _find = 2500
> > > >   ```
> > > > 4) Implement the following semantics:
> > > >    - The bookmark would be an opaque token and would include the information needed to ensure proper pagination without the need to repeat the initial parameters of the request. In fact we might prohibit setting additional parameters when the bookmark query field is specified.
> > > >    - don't use delayed responses when the `bookmark` field is provided
> > > >    - don't use delayed responses when the `limit` query key is specified and it is below the max limit
> > > >    - return 400 when the `limit` query key is specified and it is greater than the max limit (see the sketch after this list)
> > > >    - return 400 when we stream rows (in case the `limit` query key wasn't specified) and reach the max limit
> > > >    - the `previous`/`next`/`first` keys are optional and we omit them in cases where they don't make sense
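> > > >
> > > > A rough sketch of the 400-on-limit check against the `request_limits` section above (validate_limit/2 is a hypothetical helper; config:get_integer/3 is the existing config API):
> > > >    ```
> > > >    %% returns ok or throws, which the HTTP layer would turn into a 400
> > > >    validate_limit(Endpoint, Limit) ->
> > > >        Max = config:get_integer("request_limits", Endpoint, 5000),
> > > >        case Limit =< Max of
> > > >            true  -> ok;
> > > >            false -> throw({query_parse_error, <<"limit exceeds configured maximum">>})
> > > >        end.
> > > >    ```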
> > > >
> > > > Later on we would introduce API versioning and deal with the `{db}/_changes` and `_all_docs` endpoints.
> > > >
> > > > Questions:
> > > > - `bookmark` vs `token`?
> > > > - should we prohibit setting other fields when bookmark is set?
> > > > - `previous`/`next`/`first` as href vs token value itself (i.e. `{"previous": "983uiwfjkdsdf", "next": "12343tyekf3", "first": "iekjhfwo034"}`)
> > > >
> > > > Best regards,
> > > > iilyak
> > > >
> > > > On 2020/04/22 20:18:57, Ilya Khlopotov <ii...@apache.org> wrote:
> > > > > Hello everyone,
> > > > >
> > > > > Based on the discussions on the thread I would like to propose a number of first steps:
> > > > > 1) introduce new endpoints
> > > > >   - {db}/_all_docs/page
> > > > >   - {db}/_all_docs/queries/page
> > > > >   - _all_dbs/page
> > > > >   - _dbs_info/page
> > > > >   - {db}/_design/{ddoc}/_view/{view}/page
> > > > >   - {db}/_design/{ddoc}/_view/{view}/queries/page
> > > > >   - {db}/_find/page
> > > > >
> > > > > These new endpoints would act as follows:
> > > > > - don't use delayed responses
> > > > > - return an object with the following structure
> > > > >   ```
> > > > >   {
> > > > >      "total": Total,
> > > > >      "bookmark": base64 encoded opaque value,
> > > > >      "completed": true | false,
> > > > >      "update_seq": when available,
> > > > >      "page": current page number,
> > > > >      "items": [
> > > > >      ]
> > > > >   }
> > > > >   ```
> > > > > - the bookmark would include the following data (base64 or protobuf???); a record sketch follows the list:
> > > > >   - direction
> > > > >   - page
> > > > >   - descending
> > > > >   - endkey
> > > > >   - endkey_docid
> > > > >   - inclusive_end
> > > > >   - startkey
> > > > >   - startkey_docid
> > > > >   - last_key
> > > > >   - update_seq
> > > > >   - timestamp
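> > > > >
> > > > > A sketch of that bookmark as an Erlang record, serialized as a compressed term and base64-encoded (protobuf would slot into the same place; the names are illustrative):
> > > > >   ```
> > > > >   -record(bookmark, {
> > > > >       direction, page, descending,
> > > > >       endkey, endkey_docid, inclusive_end,
> > > > >       startkey, startkey_docid,
> > > > >       last_key, update_seq, timestamp
> > > > >   }).
> > > > >
> > > > >   encode(#bookmark{} = B) ->
> > > > >       base64:encode(term_to_binary(B, [compressed])).
> > > > >   ```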
> > > > >
> > > > > 2) Implement per-endpoint configurable max limits
> > > > > ```
> > > > > _all_docs = 5000
> > > > > _all_docs/queries = 5000
> > > > > _all_dbs = 5000
> > > > > _dbs_info = 5000
> > > > > _view = 2500
> > > > > _view/queries = 2500
> > > > > _find = 2500
> > > > > ```
> > > > >
> > > > > Later (after a few years) CouchDB would deprecate and remove the old endpoints.
> > > > >
> > > > > Best regards,
> > > > > iilyak
> > > > >
> > > > > On 2020/02/19 22:39:45, Nick Vatamaniuc <va...@apache.org> wrote:
> > > > > > Hello everyone,
> > > > > >
> > > > > > I'd like to discuss the shape and behavior of streaming APIs for CouchDB 4.x
> > > > > >
> > > > > > By "streaming APIs" I mean APIs which stream data in row as it gets
> > > > > > read from the database. These are the endpoints I was thinking of:
> > > > > >
> > > > > >  _all_docs, _all_dbs, _dbs_info  and query results
> > > > > >
> > > > > > I want to focus on what happens when FoundationDB transactions
> > > > > > time out after 5 seconds. Currently, all those APIs except _changes[1]
> > > > > > feeds will crash or freeze. The reason is that the
> > > > > > transaction_too_old error at the end of 5 seconds is retry-able by
> > > > > > default, so the request handlers run again and end up shoving the
> > > > > > whole request down the socket again, headers and all, which is
> > > > > > obviously broken and not what we want.
> > > > > >
> > > > > > There are a few alternatives discussed in the couchdb-dev channel.
> > > > > > I'll present some behaviors but feel free to add more. Some ideas
> > > > > > might have been discounted in the IRC discussion already, but I'll
> > > > > > present them anyway in case it sparks further conversation:
> > > > > >
> > > > > > A) Do what _changes[1] feeds do. Start a new transaction and continue
> > > > > > streaming the data from the next key after the last one emitted in the
> > > > > > previous transaction. Document the API behavior change: the view of
> > > > > > the data presented may no longer be a point-in-time[4] snapshot of the
> > > > > > DB.
> > > > > >
> > > > > >  - Keeps the API shape the same as CouchDB <4.0. Client libraries
> > > > > > don't have to change to continue using these CouchDB 4.0 endpoints.
> > > > > >  - This is the easiest to implement since it would re-use the
> > > > > > implementation of the _changes feed (an extra option passed to the fold
> > > > > > function); a sketch of the restart pattern follows below.
> > > > > >  - Breaks API behavior if users relied on having a point-in-time[4]
> > > > > > snapshot view of the data.
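> > > > > >
> > > > > > A rough sketch of the restart pattern option A would re-use (all names are hypothetical; the real logic lives in the _changes fold):
> > > > > > ```
> > > > > > %% fold_range is assumed to trap transaction_too_old itself and
> > > > > > %% report how far it got instead of letting the transaction retry
> > > > > > fold_resumable(Db, StartKey, UserFun, Acc0) ->
> > > > > >     case fold_range(Db, StartKey, UserFun, Acc0) of
> > > > > >         {done, AccN} ->
> > > > > >             AccN;
> > > > > >         {partial, LastKey, AccN} ->
> > > > > >             %% fresh transaction, resume after the last emitted key
> > > > > >             fold_resumable(Db, next_key(LastKey), UserFun, AccN)
> > > > > >     end.
> > > > > > ```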
> > > > > >
> > > > > > B) Simply end the stream. Let the users pass a `?transaction=true`
> > > > > > param which indicates they are aware the stream may end early, and so
> > > > > > they would have to paginate from the last emitted key with a skip=1
> > > > > > (an example request follows below). This keeps the request bodies the
> > > > > > same as in current CouchDB. However, if the users got all the data in
> > > > > > one request, they will end up wasting another request to see if there
> > > > > > is more data available. If they didn't get any data, they might have
> > > > > > too large of a skip value (see [2]) and would have to guess different
> > > > > > values for the start/end keys. Or impose a max limit for the `skip`
> > > > > > parameter.
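> > > > > >
> > > > > > For example, under option B a client whose stream ended at key "k99" might continue with something like (illustrative request, not a finalized API):
> > > > > > ```
> > > > > > GET /db/_all_docs?start_key="k99"&skip=1&limit=100
> > > > > > ```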
> > > > > >
> > > > > > C) End the stream and add a final metadata row like a "transaction":
> > > > > > "timeout" at the end. That will let the user know to keep paginating
> > > > > > from the last key onward. This won't work for `_all_dbs` and
> > > > > > `_dbs_info`[3]. Maybe let those two endpoints behave like _changes
> > > > > > feeds and only use this for views and _all_docs? If we like this
> > > > > > choice, let's think about what happens for those, as I couldn't come
> > > > > > up with anything decent there.
> > > > > >
> > > > > > D) Same as C, but to solve the issue with skips[2], emit a bookmark
> > > > > > "key" of where the iteration stopped along with the current "skip" and
> > > > > > "limit" params, which would keep decreasing. The user would then pass
> > > > > > those as "start_key=..." in the next request along with the limit and
> > > > > > skip params. So something like "continuation":{"skip":599, "limit":5,
> > > > > > "key":"..."}. This has the same issue with array results for
> > > > > > `_all_dbs` and `_dbs_info`[3].
> > > > > >
> > > > > > E) Enforce low `limit` and `skip` parameters. Enforce maximum values
> > > > > > there such that the response time is likely to fit in one transaction.
> > > > > > This could be tricky as different runtime environments will have
> > > > > > different characteristics. Also, if the timeout happens there isn't
> > > > > > a nice way to send an HTTP error since we already sent the 200
> > > > > > response. The downside is that this might break how some users use the
> > > > > > API, if, say, they are using large skips and limits already. Perhaps
> > > > > > here we do both B and D, such that if users want transactional
> > > > > > behavior, they specify the `transaction=true` param and only then do
> > > > > > we enforce low limit and skip maximums.
> > > > > >
> > > > > > F) At least for `_all_docs` it seems providing a point-in-time
> > > > > > snapshot view doesn't necessarily need to be tied to transaction
> > > > > > boundaries. We could check the update sequence of the database at the
> > > > > > start of the next transaction, and if it hasn't changed we can
> > > > > > continue emitting a consistent view (a sketch of the check follows
> > > > > > below). This can apply to C and D and would just determine when the
> > > > > > stream ends. If there are no writes happening to the db, this could
> > > > > > potentially stream all the data just like option A would. Not
> > > > > > entirely sure if this would work for views.
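> > > > > >
> > > > > > A minimal sketch of that sequence check, with hypothetical helpers:
> > > > > > ```
> > > > > > %% called at the start of each follow-up transaction
> > > > > > maybe_continue(Db, SeqAtStart) ->
> > > > > >     case get_update_seq(Db) =:= SeqAtStart of
> > > > > >         true  -> continue;  %% no writes since the stream began
> > > > > >         false -> stop       %% db changed; end at the last emitted key
> > > > > >     end.
> > > > > > ```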
> > > > > >
> > > > > > So what do we think? I can see different combinations of options
> > > > > > here, maybe even different ones for each API endpoint. For example,
> > > > > > `_all_dbs` and `_dbs_info` are always A, while `_all_docs` and views
> > > > > > default to A but have parameters to do F, etc.
> > > > > >
> > > > > > Cheers,
> > > > > > -Nick
> > > > > >
> > > > > > Some footnotes:
> > > > > >
> > > > > > [1] The _changes feed is the only one that works currently. It behaves
> > > > > > as per the RFC https://github.com/apache/couchdb-documentation/blob/master/rfcs/003-fdb-seq-index.md#access-patterns.
> > > > > > That is, we continue streaming the data by resetting the transaction
> > > > > > object and restarting from the last emitted key (the db sequence in
> > > > > > this case). However, because the transaction restarts, if a document
> > > > > > is updated while the streaming takes place it may appear in the
> > > > > > _changes feed twice. That's a behavior difference from CouchDB < 4.0
> > > > > > and we'd have to document it, since previously we presented a
> > > > > > point-in-time snapshot of the database from when we started streaming.
> > > > > >
> > > > > > [2] Our streaming APIs have both skips and limits. Since FDB doesn't
> > > > > > currently support efficient offsets for key selectors
> > > > > > (https://apple.github.io/foundationdb/known-limitations.html#dont-use-key-selectors-for-paging)
> > > > > > we implemented skip by iterating over the data. This means that a skip
> > > > > > of, say, 100000 could keep timing out the transaction without yielding
> > > > > > any data. The sketch below shows why.
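> > > > > >
> > > > > > A sketch of skip-by-iteration, assuming a fold callback shaped roughly like CouchDB's internal folds (names are hypothetical):
> > > > > > ```
> > > > > > %% every skipped row is still read from FDB, so skip=100000 means
> > > > > > %% 100000 reads before the first row can be emitted
> > > > > > fold_skip(_Row, {Skip, UserFun, Acc}) when Skip > 0 ->
> > > > > >     {ok, {Skip - 1, UserFun, Acc}};
> > > > > > fold_skip(Row, {0, UserFun, Acc}) ->
> > > > > >     {ok, {0, UserFun, UserFun(Row, Acc)}}.
> > > > > > ```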
> > > > > >
> > > > > > [3] _all_dbs and _dbs_info return a JSON array so they don't have an
> > > > > > obvious place to insert a last metadata row.
> > > > > >
> > > > > > [4] For example, they have a constraint that documents "a" and "z"
> > > > > > cannot both be in the database at the same time. But when iterating,
> > > > > > it's possible that "a" was there at the start; then by the end, "a"
> > > > > > was removed and "z" added, so both "a" and "z" would appear in the
> > > > > > emitted stream. Note that FoundationDB has APIs which exhibit the same
> > > > > > "relaxed" constraints:
> > > > > > https://apple.github.io/foundationdb/api-python.html#module-fdb.locality
> > > > > >
> > > > >
> > > >
> > >
>