Posted to commits@couchdb.apache.org by Apache Wiki <wi...@apache.org> on 2008/11/02 02:13:32 UTC

[Couchdb Wiki] Update of "HTTP Document API" by MartinCzura

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The following page has been changed by MartinCzura:
http://wiki.apache.org/couchdb/HTTP_Document_API

The comment on the change is:
page creation

New page:
An introduction to the CouchDB HTTP document API.

== Naming/Addressing ==

Documents stored in a CouchDB database have a DocID. DocIDs are case-sensitive string identifiers that uniquely identify a document. Two documents in the same database cannot have the same identifier; if they do, they are considered the same document.

{{{
http://localhost:5984/test/some_doc_id
http://localhost:5984/test/another_doc_id
http://localhost:5984/test/BA1F48C5418E4E68E5183D5BD1F06476
}}}

The above URLs point to ''some_doc_id'', ''another_doc_id'' and ''BA1F48C5418E4E68E5183D5B!D1F06476'' in the database ''test''.

=== Valid Document Ids ===

  Q: What's the rule on a valid document id? The examples suggest it's restricted to ''[a-zA-Z0-9_]''? What about multi-byte UTF-8 characters? Any other non alphanums other than ''_''?

  A: There is no restriction yet on document ids at the database level. However, I haven't tested what happens when you try to use multibyte in the URL. It could be it "just works", but most likely there is a multi-byte char escaping/encoding/decoding step that needs to be done somewhere. For now, I'd just stick with valid URI characters and nothing "special".

  The reason database names have strict restrictions is to simplify database name-to-file mapping. Since databases will need to replicate across operating systems, the file naming scheme needed to be the lowest common denominator.
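The advice above about sticking to valid URI characters can be relaxed on the client side by percent-encoding the DocID before it goes into the URL. The following is a minimal sketch, not a tested recipe against CouchDB: the helper name ''doc_url'' is hypothetical, and it only assumes that the server decodes percent-encoded UTF-8 in the path.

```python
from urllib.parse import quote


def doc_url(base, db, doc_id):
    """Return the URL of a document, percent-encoding the DocID.

    Hypothetical helper: safe="" also encodes "/" so that the DocID
    stays a single path segment.
    """
    return f"{base}/{quote(db, safe='')}/{quote(doc_id, safe='')}"


# Plain IDs pass through unchanged; multi-byte characters are
# encoded as UTF-8 and then percent-escaped.
print(doc_url("http://localhost:5984", "test", "some_doc_id"))
print(doc_url("http://localhost:5984", "test", "caf\u00e9"))
```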

== JSON ==

A CouchDB document is simply a JSON object (along with metadata revision info if ''?full=true'' is in the URL query arguments).

This is an example document:

{{{
{
 "_id":"discussion_tables",
 "_rev":"D1C946B7",
 "Subject":"I like Plankton",
 "Author":"Rusty",
 "PostedDate":"2006-08-15T17:30:12-04:00",
 "Tags":["plankton", "baseball", "decisions"],
 "Body":"I decided today that I don't like baseball. I like plankton."
}
}}}

The document can be an arbitrary JSON object, but note that any top-level fields with a name that starts with a ''_'' prefix are reserved for use by CouchDB itself. Common examples for such fields are ''_id'' and ''_rev'', as shown above.
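Client code that needs to tell CouchDB's reserved fields apart from application data can rely on the ''_'' prefix rule. A small sketch, assuming the document has already been parsed into a Python dict (the function name ''split_reserved'' is made up for illustration):

```python
def split_reserved(doc):
    """Split a parsed document into (reserved, user) field dicts.

    Only top-level names are checked, since the reservation applies
    to top-level fields only.
    """
    reserved = {k: v for k, v in doc.items() if k.startswith("_")}
    user = {k: v for k, v in doc.items() if not k.startswith("_")}
    return reserved, user


doc = {
    "_id": "discussion_tables",
    "_rev": "D1C946B7",
    "Subject": "I like Plankton",
    "Author": "Rusty",
}
reserved, user = split_reserved(doc)
print(sorted(reserved))  # names reserved for CouchDB
print(sorted(user))      # names owned by the application
```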

Another example:

{{{
{
 "_id":"discussion_tables",
 "_rev":"D1C946B7",
 "Sunrise":true,
 "Sunset":false,
 "FullHours":[1,2,3,4,5,6,7,8,9,10],
 "Activities": [
   {"Name":"Football", "Duration":2, "DurationUnit":"Hours"},
   {"Name":"Breakfast", "Duration":40, "DurationUnit":"Minutes", "Attendees":["Jan", "Damien", "Laura", "Gwendolyn", "Roseanna"]}
 ]
}
}}}

Note that by default the structure is flat; in this case, the ''Activities'' attribute is structure imposed by the user.

== All Documents ==

To get a listing of all documents in a database, use the special ''_all_docs'' URI:

{{{
GET /somedatabase/_all_docs HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
}}}

This returns a listing of all documents and their revision IDs, ordered by DocID (case sensitive):

{{{
HTTP/1.1 200 OK
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{
  "total_rows": 3, "offset": 0, "rows": [
    {"id": "doc1", "key": "doc1", "value": {"rev": "4324BB"}},
    {"id": "doc2", "key": "doc2", "value": {"rev":"2441HF"}},
    {"id": "doc3", "key": "doc3", "value": {"rev":"74EC24"}}
  ]
}
}}}

Use the query argument ''descending=true'' to reverse the order of the output table. The response is the same as before, but in reverse order:

{{{
HTTP/1.1 200 OK
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{
  "total_rows": 3, "offset": 0, "rows": [
    {"id": "doc3", "key": "doc3", "value": {"rev":"74EC24"}},
    {"id": "doc2", "key": "doc2", "value": {"rev":"2441HF"}},
    {"id": "doc1", "key": "doc1", "value": {"rev": "4324BB"}}
  ]
}
}}}

The query string parameters ''startkey'' and ''count'' may also be used to limit the result set. For example:

{{{
GET /somedatabase/_all_docs?startkey=doc2&count=2 HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
}}}

This returns:

{{{
HTTP/1.1 200 OK
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{
  "total_rows": 3, "offset": 1, "rows": [
    {"id": "doc2", "key": "doc2", "value": {"rev":"2441HF"}},
    {"id": "doc3", "key": "doc3", "value": {"rev":"74EC24"}}
  ]
}
}}}

And combined with ''descending'':

{{{
GET /somedatabase/_all_docs?startkey=doc2&count=2&descending=true HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
}}}

This returns:

{{{
HTTP/1.1 200 OK
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{
  "total_rows": 3, "offset": 1, "rows": [
    {"id": "doc2", "key": "doc2", "value": {"rev":"2441HF"}},
    {"id": "doc1", "key": "doc1", "value": {"rev":"4324BB"}}
  ]
}
}}}
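The ''_all_docs'' query arguments described in this section can be assembled programmatically. The sketch below only builds the request path; the function name ''all_docs_path'' is hypothetical, and the parameter names (''startkey'', ''count'', ''descending'') are the ones used on this page, so they may differ in other CouchDB versions.

```python
from urllib.parse import urlencode


def all_docs_path(db, startkey=None, count=None, descending=False):
    """Build the request path for an _all_docs listing."""
    params = []
    if startkey is not None:
        params.append(("startkey", startkey))
    if count is not None:
        params.append(("count", count))
    if descending:
        params.append(("descending", "true"))
    query = urlencode(params)  # percent-encodes values where needed
    return f"/{db}/_all_docs" + (f"?{query}" if query else "")


print(all_docs_path("somedatabase"))
print(all_docs_path("somedatabase", startkey="doc2", count=2))
print(all_docs_path("somedatabase", startkey="doc2", count=2, descending=True))
```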

== Working With Documents Over HTTP ==

=== GET ===

To retrieve a document, simply perform a ''GET'' operation at the document's URL:

{{{
GET /somedatabase/some_doc_id HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
}}}

Here is the server's response:

{{{
HTTP/1.1 200 OK
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{
 "_id":"some_doc_id",
 "_rev":"946B7D1C",
 "Subject":"I like Plankton",
 "Author":"Rusty",
 "PostedDate":"2006-08-15T17:30:12-04:00",
 "Tags":["plankton", "baseball", "decisions"],
 "Body":"I decided today that I don't like baseball. I like plankton."
}
}}}

=== Accessing Previous Revisions ===

See ["DocumentRevisions"] for additional notes on revisions.

The above example gets the current revision. You can get a specific revision by using the following syntax:

{{{
GET /somedatabase/some_doc_id?rev=946B7D1C HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
}}}

To find out what revisions are available for a document, you can do:

{{{
GET /somedatabase/some_doc_id?revs=true HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
}}}

This returns the current revision of the document, but with an additional field, ''_revs'', whose value is a list of the available revision IDs. Note, though, that not every one of those revisions is necessarily still stored on disk. For example, the content of an old revision may get removed by compacting the database, or it may only exist in a different database if it was replicated.

To get more detailed information about the available document revisions, use the ''revs_info'' parameter instead. In this case, the JSON result will contain a ''_revs_info'' property, which is an array of objects, for example:

{{{
{
  "_revs_info": [
    {"rev": "123456", "status": "disk"},
    {"rev": "234567", "status": "missing"},
    {"rev": "345678", "status": "deleted"}
  ]
}
}}}

Here, ''disk'' means the revision content is stored on disk and can still be retrieved. The other values indicate that the content of that revision is not available.
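A client can use the ''_revs_info'' array to decide which revisions are worth requesting with ''?rev=''. A minimal sketch, assuming the response body has been parsed into a dict shaped like the example above (the function name ''revisions_on_disk'' is hypothetical):

```python
def revisions_on_disk(doc):
    """Return the revision IDs whose content can still be retrieved.

    Any status other than "disk" is treated as unavailable.
    """
    return [
        info["rev"]
        for info in doc.get("_revs_info", [])
        if info.get("status") == "disk"
    ]


doc = {
    "_revs_info": [
        {"rev": "123456", "status": "disk"},
        {"rev": "234567", "status": "missing"},
        {"rev": "345678", "status": "deleted"},
    ]
}
print(revisions_on_disk(doc))
```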

=== PUT ===

To create a new document you can use either a ''POST'' operation or a ''PUT'' operation. To create or update a named document using the ''PUT'' operation, the URL must point to the document's location.

The following is an example HTTP ''PUT''. It will cause the CouchDB server to generate a new revision ID and save the document with it.

{{{
PUT /somedatabase/some_doc_id HTTP/1.0
Content-Length: 245
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json

{
  "Subject":"I like Plankton",
  "Author":"Rusty",
  "PostedDate":"2006-08-15T17:30:12-04:00",
  "Tags":["plankton", "baseball", "decisions"],
  "Body":"I decided today that I don't like baseball. I like plankton."
}
}}}

Here is the server's response.

{{{
HTTP/1.1 201 Created
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{"ok": true, "id": "some_doc_id", "rev": "946B7D1C"}
}}}

To update an existing document, you also issue a ''PUT'' request. In this case, the JSON body must contain a ''_rev'' property, which lets CouchDB know which revision the edits are based on. If the revision of the document currently stored in the database doesn't match, then a ''409'' conflict error is returned.

If the revision number does match what's in the database, a new revision number is generated and returned to the client.

For example:

{{{
PUT /somedatabase/some_doc_id HTTP/1.0
Content-Length: 245
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json

{
  "_id":"some_doc_id",
  "_rev":"946B7D1C",
  "Subject":"I like Plankton",
  "Author":"Rusty",
  "PostedDate":"2006-08-15T17:30:12-04:00",
  "Tags":["plankton", "baseball", "decisions"],
  "Body":"I decided today that I don't like baseball. I like plankton."
}
}}}

Here is the server's response if what is stored in the database is revision ''946B7D1C'' of document ''some_doc_id''.

{{{
HTTP/1.1 201 Created
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{"ok":true, "id":"some_doc_id", "rev":"946B7D1C"}
}}}

And here is the server's response if there is an update conflict (what is currently stored in the database is not revision ''946B7D1C'' of document ''some_doc_id'').

{{{
HTTP/1.1 409 CONFLICT
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Length: 49
Connection: close

{"error":{"id":"conflict","reason":"3073715634"}}
}}}
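The update cycle described above (send the ''_rev'' you based your edits on, then handle a possible ''409'') might look like this from a Python client. This is a sketch under assumptions, not a reference client: the function names are hypothetical, only the standard library is used, and ''send_update'' is never called here because it needs a running server.

```python
import json
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def build_update(base, db, doc_id, doc, rev):
    """Build a PUT request that updates doc_id based on revision rev."""
    body = dict(doc, _id=doc_id, _rev=rev)  # copy, then add reserved fields
    return Request(
        f"{base}/{db}/{doc_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send_update(req):
    """Send the request; return the new revision, or None on a conflict."""
    try:
        with urlopen(req) as resp:
            return json.load(resp)["rev"]
    except HTTPError as err:
        if err.code == 409:  # stored revision did not match _rev
            return None
        raise


req = build_update(
    "http://localhost:5984", "somedatabase", "some_doc_id",
    {"Subject": "I like Plankton", "Author": "Rusty"}, "946B7D1C",
)
print(req.get_method(), req.full_url)
```

On a ''None'' result, a client would typically ''GET'' the document again, reapply its changes to the current revision, and retry.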

=== POST ===

The ''POST'' operation can be used to create a new document with a server-generated DocID. To create a named document, use the ''PUT'' method instead. It is recommended that you avoid ''POST'' when possible, because proxies and other network intermediaries will occasionally resend ''POST'' requests, which can result in duplicate document creation. If your client software is not capable of generating cryptographically secure UUIDs, use a ''POST'' to ''/_uuids?count=100'' to retrieve a list of unused document IDs for future ''PUT'' requests.

The following is an example HTTP ''POST''. It will cause the CouchDB server to generate a new DocID and revision ID and save the document with it.

{{{
POST /somedatabase/ HTTP/1.0
Content-Length: 245
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json

{
  "Subject":"I like Plankton",
  "Author":"Rusty",
  "PostedDate":"2006-08-15T17:30:12-04:00",
  "Tags":["plankton", "baseball", "decisions"],
  "Body":"I decided today that I don't like baseball. I like plankton."
}
}}}

Here is the server's response:

{{{
HTTP/1.1 201 Created
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{"ok":true, "id":"123BAC", "rev":"946B7D1C"}
}}}
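To follow the recommendation of preferring ''PUT'' over ''POST'', a client can generate the DocID itself. The sketch below assumes that a random UUID rendered as 32 uppercase hex digits is an acceptable DocID (it has the same shape as the generated IDs shown earlier on this page); the helper name ''new_doc_id'' is hypothetical.

```python
import uuid


def new_doc_id():
    """Return a random 32-character hex DocID.

    uuid4() draws from the operating system's random source.
    """
    return uuid.uuid4().hex.upper()


doc_id = new_doc_id()
# The ID can then be used in a PUT to /somedatabase/<doc_id>.
print(f"/somedatabase/{doc_id}")
```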

=== Modify Multiple Documents With a Single Request ===

CouchDB provides a bulk insert/update feature. To use this, you make a ''POST'' request to the URI ''/{dbname}/_bulk_docs'', with the request body being a JSON document containing a list of new documents to be inserted or updated. The bulk post is a transactional operation: either all updates and insertions succeed, or all fail.

Doc formats below are as per CouchDB 0.8.0.

{{{
{
  "docs": [
    {"_id": "0", "integer": 0, "string": "0"},
    {"_id": "1", "integer": 1, "string": "1"},
    {"_id": "2", "integer": 2, "string": "2"}
  ]
}
}}}

If you omit the per-document ''_id'' specification, CouchDB will generate unique IDs for you, as it does for regular ''POST'' requests to the database URI.

The response to such a bulk request would look as follows:

{{{
{
  "ok":true,
  "new_revs": [
    {"id": "0", "rev": "3682408536"},
    {"id": "1", "rev": "3206753266"},
    {"id": "2", "rev": "426742535"}
  ]
}
}}}

Updating existing documents requires setting the ''_rev'' member to the revision being updated. To delete a document, set the ''_deleted'' member to ''true''.

{{{
{
  "docs": [
    {"_id": "0", "_rev": "3682408536", "_deleted": true},
    {"_id": "1", "_rev": "3206753266", "integer": 2, "string": "2"},
    {"_id": "2", "_rev": "426742535", "integer": 3, "string": "3"}
  ]
}
}}}

Note that CouchDB will return in the response an id and revision for every document passed as content to a bulk insert, even for those that were just deleted.
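The request and response bodies for ''_bulk_docs'' are easy to build and read with a JSON library. A sketch, assuming the 0.8.0 formats shown above (in particular the ''new_revs'' response member); both function names are made up for illustration.

```python
import json


def bulk_body(upserts=(), deletions=()):
    """Build the JSON body for a POST to /{dbname}/_bulk_docs.

    upserts:   iterable of document dicts (with _rev when updating)
    deletions: iterable of (doc_id, rev) pairs to delete
    """
    docs = [dict(d) for d in upserts]
    docs += [{"_id": i, "_rev": r, "_deleted": True} for i, r in deletions]
    return json.dumps({"docs": docs})


def revs_by_id(response):
    """Map each document id in a bulk response to its new revision."""
    return {row["id"]: row["rev"] for row in response["new_revs"]}


body = bulk_body(
    upserts=[{"_id": "1", "_rev": "3206753266", "integer": 2, "string": "2"}],
    deletions=[("0", "3682408536")],
)
print(body)
```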

=== DELETE ===

To delete a document, perform a ''DELETE'' operation at the document's location, passing the ''rev'' parameter with the document's current revision. If successful, it will return the revision id for the deletion stub.

{{{
DELETE /somedatabase/some_doc?rev=1582603387 HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
}}}

As an alternative, you can pass the revision in the ''If-Match'' header field instead of the ''rev'' parameter:
{{{
DELETE /somedatabase/some_doc HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
If-Match: "1582603387"
}}}

And the response:

{{{
HTTP/1.1 200 OK
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{"ok":true,"rev":"2839830636"}
}}}
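Both ways of passing the revision to ''DELETE'' can be expressed with Python's standard library. This is a sketch that only constructs the requests; the function names are hypothetical and nothing is sent.

```python
from urllib.parse import quote
from urllib.request import Request


def delete_with_param(base, db, doc_id, rev):
    """DELETE request carrying the revision in the rev query parameter."""
    url = f"{base}/{db}/{doc_id}?rev={quote(rev, safe='')}"
    return Request(url, method="DELETE")


def delete_with_if_match(base, db, doc_id, rev):
    """DELETE request carrying the revision in a quoted If-Match header."""
    return Request(
        f"{base}/{db}/{doc_id}",
        headers={"If-Match": f'"{rev}"'},
        method="DELETE",
    )


req = delete_with_param("http://localhost:5984", "somedatabase", "some_doc", "1582603387")
print(req.get_method(), req.full_url)
```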

== Attachments ==

Documents can have attachments, just like email. There are two ways to use attachments. The first is inline with your document and is described first. The second is a separate REST API for attachments, described a little further down.

=== Inline Attachments ===
On creation, attachments go into a special ''_attachments'' attribute of the document. They are encoded in a JSON structure that holds the name, the content_type and the base64 encoded data of an attachment. A document can have any number of attachments.

When retrieving documents, the attachment's actual data is not included, only the metadata. The actual data has to be fetched separately, using a special URI.

Creating a document with an attachment:

{{{
{
  "_id":"attachment_doc",
  "_attachments":
  {
    "foo.txt":
    {
      "content_type":"text\/plain",
      "data": "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
    }
  }
}
}}}

Please note that any base64 data you send has to be on '''a single line of characters''', so pre-process your data to remove any carriage returns and newlines.

Requesting said document:

{{{
GET /database/attachment_doc
}}}

CouchDB replies:

{{{
{
  "_id":"attachment_doc",
  "_rev":"1589456116",
  "_attachments":
  {
    "foo.txt":
    {
      "stub":true,
      "content_type":"text\/plain",
      "length":29
    }
  }
}
}}}

Note that the ''"stub":true'' attribute denotes that this is not the complete attachment. Also, note the length attribute added automatically.

Requesting the attachment:

{{{
GET /database/attachment_doc/foo.txt
}}}

CouchDB returns:

{{{
This is a base64 encoded text
}}}

Automatically decoded!
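Producing the single-line base64 data for an inline attachment is straightforward in Python, because ''base64.b64encode'' does not insert line breaks. A sketch (the helper name ''inline_attachment'' is hypothetical):

```python
import base64
import json


def inline_attachment(content_type, data):
    """Return the JSON structure for one inline attachment.

    data is raw bytes; b64encode yields one unbroken line.
    """
    return {
        "content_type": content_type,
        "data": base64.b64encode(data).decode("ascii"),
    }


doc = {
    "_id": "attachment_doc",
    "_attachments": {
        "foo.txt": inline_attachment("text/plain", b"This is a base64 encoded text"),
    },
}
print(json.dumps(doc, indent=2))
```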

=== Multiple Attachments ===

Creating a document with multiple attachments:

{{{
{
  "_id":"attachment_doc",
  "_attachments":
  {
    "foo.txt":
    {
      "content_type":"text\/plain",
      "data": "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
    },

    "bar.txt":
    {
      "content_type":"text\/plain",
      "data": "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
    }
  }
}
}}}


=== Standalone Attachments ===
CouchDB allows you to create, change and delete attachments without touching the actual document. As a bonus, you do not have to base64 encode your data. This can significantly speed up requests, since neither CouchDB nor your client has to do the base64 conversion.

You need to specify a MIME type using the Content-Type header. CouchDB will serve the attachment with the specified Content-Type when asked.

To create an attachment:
{{{
PUT /somedatabase/document/attachment?rev=123 HTTP/1.0
Content-Length: 245
Content-Type: image/jpeg

<JPEG data>
}}}

CouchDB replies:
{{{
{"ok": true, "id": "document", "rev": "765B7D1C"}
}}}

Note that you can do this on a non-existing document. The document and attachment will be created implicitly for you. A revision id must not be specified in this case.

To change an attachment:
{{{
PUT /somedatabase/document/attachment?rev=765B7D1C HTTP/1.0
Content-Length: 245
Content-Type: image/jpeg

<JPEG data>
}}}

CouchDB replies:
{{{
{"ok": true, "id": "document", "rev": "766FC88G"}
}}}

To delete an attachment:

{{{
DELETE /somedatabase/document/attachment?rev=765B7D1C HTTP/1.0
}}}

CouchDB replies:
{{{
{"ok":true,"id":"document","rev":"519558700"}
}}}

To retrieve an attachment:

{{{
GET /somedatabase/document/attachment HTTP/1.0
}}}

CouchDB replies:
{{{
Content-Type: image/jpeg

<JPEG data>
}}}
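A standalone attachment upload sends the raw bytes with the MIME type in the ''Content-Type'' header, and the ''rev'' parameter only when the document already exists. The following sketch builds such a request without sending it; ''put_attachment'' is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request


def put_attachment(base, db, doc_id, name, data, content_type, rev=None):
    """Build a PUT request for a standalone attachment.

    Leave rev as None when the document does not exist yet.
    """
    url = f"{base}/{db}/{doc_id}/{name}"
    if rev is not None:
        url += "?" + urlencode({"rev": rev})
    return Request(
        url,
        data=data,  # raw bytes, no base64 step
        headers={"Content-Type": content_type},
        method="PUT",
    )


req = put_attachment(
    "http://localhost:5984", "somedatabase", "document", "attachment",
    b"<JPEG data>", "image/jpeg", rev="765B7D1C",
)
print(req.get_method(), req.full_url)
```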


== ETags/Caching ==

CouchDB sends an ''ETag'' Header for document requests. The ETag Header is simply the document's revision in quotes.

For example, a ''GET'' request:

{{{
GET /database/123182719287
}}}

Results in a reply with the following headers:

{{{
cache-control: no-cache
pragma: no-cache
expires: Tue, 13 Nov 2007 23:09:50 GMT
transfer-encoding: chunked
content-type: text/plain;charset=utf-8
etag: "615790463"
}}}

''POST'' requests also return an ''ETag'' header for either newly created or updated documents.
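Because the ''ETag'' is just the revision in quotes, a client can recover the revision from the header without parsing the body. A small sketch, with hypothetical function names:

```python
def rev_from_etag(etag):
    """Strip surrounding whitespace and quotes from an ETag value."""
    return etag.strip().strip('"')


def etag_from_rev(rev):
    """Quote a revision so it can be sent in a header such as If-Match."""
    return f'"{rev}"'


print(rev_from_etag('"615790463"'))
print(etag_from_rev("615790463"))
```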