You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@couchdb.apache.org by wo...@apache.org on 2020/11/30 18:55:31 UTC

[couchdb-documentation] branch main updated: Add pagination proposal (#530)

This is an automated email from the ASF dual-hosted git repository.

wohali pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/couchdb-documentation.git


The following commit(s) were added to refs/heads/main by this push:
     new 6cd170f  Add pagination proposal (#530)
6cd170f is described below

commit 6cd170fa84f5473f5a593f07426e174b931d771b
Author: iilyak <ii...@users.noreply.github.com>
AuthorDate: Mon Nov 30 10:55:21 2020 -0800

    Add pagination proposal (#530)
    
    * Add pagination proposal
    
    * POST is not supported
    
    * Fix typo
    
    * Move '_dbs_info' out of scope
    
    * Return token only
    
    * We cannot return 400 for streaming replies
    
    * Make 'skip' parameter subject to limit
    
    * Document additional semantics
    
    * cosmetics
---
 rfcs/014-pagination.md | 215 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 215 insertions(+)

diff --git a/rfcs/014-pagination.md b/rfcs/014-pagination.md
new file mode 100644
index 0000000..43dd136
--- /dev/null
+++ b/rfcs/014-pagination.md
@@ -0,0 +1,215 @@
+---
+name: Pagination support
+about: Add bookmark based pagination interface
+title: 'Pagination API'
+labels: rfc, discussion
+assignees: ''
+
+---
+
+Implement pagination support for query like endpoints
+
+# Introduction
+
+The main driver for this proposal is the introduction of FoundationDB as a
+storage engine. FoundationDB imposes limits on transaction duration and size.
+Therefore we need to find way to restrict amount of data we return to customers.
+We could simply set the maximum `limit` which would cap amount of rows client can
+request. However this "solution" has a big disadvantage. Which is it would require
+clients to write pagination recipe in their code. Current pagination scheme
+requires complex logic on the client side. There are quite a few corner cases
+to handle.
+
+## Abstract
+
+The main addition is to add new bookmark based pagination scheme to all quiery
+like endpoints. As a first step we would keep `_all_dbs`, `_dbs_info` and
+`_changes` out of scope for the following reasons:
+- _all_dbs - the endpoint returns a list instead of object as the rest of the endpoints
+- _dbs_info - the endpoint returns a list instead of object as the rest of the endpoints
+- _changes - the endpoint contains too many different modes and it would require
+  more careful consideration
+
+The endpoints in scope are:
+- {db}/_all_docs
+- {db}/_all_docs/queries
+- {db}/_design/{ddoc}/_view/{view}
+- {db}/_design/{ddoc}/_view/{view}/queries
+
+In a nutshell the idea is:
+- add `page_size` query field to control number of rows on each page and to flag
+  that client expects paginated response
+- add `first`, `previous`, `next` fields which contain bookmark part of URI
+- add `bookmark` query field to retrieve bookmarked page
+
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+"SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
+document are to be interpreted as described in
+[RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+- `bookmark` - is opaque token which would contain information needed to retrieve
+  bookmarked page. The format of the token value MUST NOT be relied upon in the client.
+
+---
+
+# Detailed Description
+
+## Implementation proposal
+
+1) Add new optional query field called `bookmark` to following endpoints
+  - {db}/_all_docs
+  - {db}/_all_docs/queries
+  - {db}/_design/{ddoc}/_view/{view}
+  - {db}/_design/{ddoc}/_view/{view}/queries
+
+2) Add new optional query field called `page_size` if it is set we would use
+  paginated endpoint otherwise use old code path
+3) Implement per-endpoint configurable max limits to restrict page size of
+  paginated responses
+
+ ```
+ [request_limits]
+ _all_docs = 5000
+ _all_docs/queries = 5000
+ _all_dbs = 5000
+ _dbs_info = 5000
+ _view = 2500
+ _view/queries = 2500
+ _find = 2500
+ ```
+4) Add following additional fields into response:
+  ```
+    "first": "12345678945621321689",
+    "previous": "983uiwfjkdsdf",
+    "next": "12343tyekf3"
+  ```
+
+## Limitations
+
+- The `first`/`next`/`last` keys in the response are represented as path which
+  includes the bookmark query key. This means the bookmark token size contributes
+  to total URI length and is subject to a max URL lenght (around 2000 characters).
+  This means storing `keys` in a bookmark is not an option. For that reason
+  `POST` method is not supported when pagination is enabled
+- Ideally we would want to signal (return 400) when number of rows returned from
+  streaming version of the endpoint goes over limit configured in `request_limit`.
+  However with streaming we've aready sent a return code.
+
+## Semantics of the implementation
+
+- Only GET method would have pagination support
+- The bookmark would include information needed to ensure proper pagination
+  without the need to repeat initial parameters of the request.
+- Don't use delayed responses when `bookmark` field is provided
+- Don't use delayed responses when `page_size` query key is specified and when
+  it is below the max limit
+- Return 400 when `bookmark` field is set and other query fields are present
+- Return 400 when `page_size` query key is specified and it is greater than
+  the max limit
+- The `previous`/`next`/`first` keys are optional and we omit them for the cases
+  they don't make sense
+- Set default value for `page_size` to `limit` if provided `limit` is less than
+  value configured in `request_limit` of `default.ini` for the given endpoint
+- Set default value for `page_size` to the value configured in `request_limit`
+  of `default.ini` for the given endpoint
+- Once the `limit` is reached the final response will not have a "next" bookmark
+- The maximum value for `skip` query parameter is limited to the same `page_size` or
+  value configured in `request_limit` whatever is less
+- Once the underlying call to FoundationDB returns less than `page_size`
+  the response will not have a "next" bookmark
+- When `page_size` is used with `_all_docs/queries` or `{db}/_design/{ddoc}/_view/{view}/queries`
+  the specified limit applies to number of queries provided in the request.
+- For `_all_docs/queries` and `{db}/_design/{ddoc}/_view/{view}/queries` the total
+  number of rows returned shouldn't exceed provided `page_size` or configured
+  max limit (whatever is less)
+- Paginated requests are subject to FDB transaction timeout. This is implemented
+  via lack of `{restart_tx, true}` option for FDB calls.
+- The request to `_all_docs/queries` and `{db}/_design/{ddoc}/_view/{view}/queries`
+  can include bookmarks:
+  ```
+  {"queries": [
+    {"bookmark": "bookmarkForQuery1PageL"},
+    {"bookmark": "bookmarkForQuery2PageM"},
+    {"bookmark": "bookmarkForQuery3PageN"}
+    ]
+  }
+  ```
+- Every bookmark returned by `_all_docs/queries` and `{db}/_design/{ddoc}/_view/{view}/queries`
+  can be submited via separate request to `_all_docs` and `{db}/_design/{ddoc}/_view/{view}`
+  correspondly.
+
+
+## Configuration
+
+The page size limits are configured in `default.ini` (or another `ini` file) in
+`request_limit` section as follows:
+
+```
+ [request_limits]
+ _all_docs = 5000
+ _all_docs/queries = 5000
+ _all_dbs = 5000
+ _dbs_info = 5000
+ _view = 2500
+ _view/queries = 2500
+ _find = 2500
+```
+
+## Roadmap
+
+- initial implementation as described in this document
+- create API versioning proposal and implement the feature
+- create separate proposal for `_changes` endpoint
+- implement pagination enabled version of `_all_dbs` and `_dbs_info` which would
+  change response type to be object (using versioned API feature)
+
+# Key Changes
+
+- New configuration section
+- New query fields
+- New fields in response body
+- Enforcing strict limits on number of rows requested by client
+
+## Applications and Modules affected
+
+- chttpd
+
+## HTTP API additions
+
+N/A
+
+## HTTP API deprecations
+
+N/A
+
+# Security Considerations
+
+No changes to the security model
+
+# References
+
+- [Streaming API in CouchDB 4.0 discussion](https://lists.apache.org/thread.html/r02cee7045cac4722e1682bb69ba0ec791f5cce025597d0099fb34033%40%3Cdev.couchdb.apache.org%3E)
+- [Streaming API in CouchDB 4.0 discussion part 2](https://lists.apache.org/thread.html/ra8d16937cca332207d772844d2789f932fbc4572443a354391663b9c%40%3Cdev.couchdb.apache.org%3E)
+- [API versioning discussion](https://lists.apache.org/thread.html/rcc742c0fdca0363bb338b54526045720868597ea35ee6842aef174e0%40%3Cdev.couchdb.apache.org%3E)
+
+# Acknowledgements
+
+[TIP]:  # ( Who helped you write this RFC? )
+
+Thank you to all partitipants in the discussion threads. You all helped to shape
+and refine this proposal in one form or another.
+
+- Adam Kocoloski
+- Garren Smith
+- Glynn Bird
+- Joan Touzet
+- Mike Rhodes
+- Nick Vatamaniuc
+- Paul Davis
+- Richard Ellis
+- Robert Samuel Newson