Posted to commits@couchdb.apache.org by Apache Wiki <wi...@apache.org> on 2010/05/02 20:07:55 UTC

[Couchdb Wiki] Update of "View_Snippets" by SebastianCohnen

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "View_Snippets" page has been changed by SebastianCohnen.
The comment on this change is: added first level heading; added TOC; removed unnecessary anchors.
http://wiki.apache.org/couchdb/View_Snippets?action=diff&rev1=35&rev2=36

--------------------------------------------------

+ = View Snippets =
+ <<TableOfContents()>>
+ 
  This page collects code snippets to be used in your [[Views]]. They are mainly meant to help get your head around the map/reduce approach to accessing database content. Keep in mind that the Futon web client silently adds group=true to your views.
  
-   * [[#common_mistakes|Common mistakes]]
-   * [[#get_doc_id|Get docs with a particular user id ]]
-   * [[#get_doc_with_attachment|Get all documents which have an attachment ]]
-   * [[#count_doc_with_attachment|Count documents with and without an attachment]]
-   * [[#list_unique_values|Generating a list of unique values]]
-   * [[#top_n_tags|Retrieve the top N tags]]
-   * [[#aggregate_sum|Joining an aggregate sum along with related data ]]
-   * [[#standard_deviation|Computing the standard deviation]]
-   * [[#summary_stats|Computing simple summary statistics (min,max,mean,standard deviation) ]]
-   * [[#interactive_couchdb|Interactive CouchDB Tutorial]]
-   * [[#documents_without_a_field|Retrieving documents without a certain field]]
-   * [[#geospatial_indexes|Using views to search for sort documents geographically]]
  
- <<Anchor(common_mistakes)>>
  == Common mistakes ==
  
  When creating a reduce function, the re-reduce case must produce the same result as the regular reduce. The reason is that you cannot control whether CouchDB calls your reduce function directly on the map results or on previously reduced values.
  
  Think about it this way: if you have a bunch of values V1, V2, V3 for key K, then the combined result can be obtained either by calling reduce([K,K,K],[V1,V2,V3],0) on the map rows directly, or by re-reducing the individually reduced results: reduce(null,[R1,R2,R3],1). Which form CouchDB uses depends on what your view results look like internally, so both must return the same answer.
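
  As an illustration (a sketch, not part of the original snippets), a simple counting reduce that behaves the same on both paths looks like this: on a regular reduce it counts the map rows, and on a re-reduce it adds up the partial counts.

  {{{
  function (keys, values, rereduce) {
    if (rereduce) {
      // values are partial counts returned by earlier reduce calls
      return sum(values);
    }
    // values come straight from the map rows
    return values.length;
  }
  }}}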
  
- <<Anchor(get_doc_id)>>
+ 
  == Get docs with a particular user id ==
  
  {{{
@@ -35, +25 @@

  
  Then query with key=USER_ID to get all the rows that match that user.
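
  A minimal map function for this might look like the following sketch, assuming each document stores the user's id in a `user_id` field (the field name is an assumption chosen for illustration):

  {{{
  function (doc) {
    // one row per document, keyed by the (assumed) user_id field
    if (doc.user_id) {
      emit(doc.user_id, doc);
    }
  }
  }}}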
  
- <<Anchor(get_doc_with_attachment)>>
+ 
  == Get all documents which have an attachment ==
  
  This lists only the documents which have an attachment.
@@ -50, +40 @@

  
  In SQL this would be something like {{{SELECT id FROM table WHERE attachment IS NOT NULL}}}.
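
  A minimal sketch of such a map function, relying on the fact that CouchDB only adds the `_attachments` member to documents that actually carry attachments:

  {{{
  function (doc) {
    // _attachments only exists on documents with at least one attachment
    if (doc._attachments) {
      emit(doc._id, null);
    }
  }
  }}}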
  
- <<Anchor(count_doc_with_attachment)>>
+ 
  == Count documents with and without an attachment ==
  
  Call this with ''group=true'' or you only get the combined number of documents with and without attachments.
@@ -80, +70 @@

  
  In SQL this would be something along the lines of {{{SELECT num_attachments FROM table GROUP BY num_attachments}}} (but this would give extra output for rows containing more than one attachment).
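
  One way to sketch such a view is to emit a constant key per category from the map and sum in the reduce; with ''group=true'' you then get one row for each of the two keys:

  {{{
  // map
  function (doc) {
    emit(doc._attachments ? "with attachment" : "without attachment", 1);
  }

  // reduce
  function (keys, values, rereduce) {
    return sum(values);
  }
  }}}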
  
- <<Anchor(list_unique_values)>>
+ 
  == Generating a list of unique values ==
  
  Here we use the fact that the key for a view result can be an array. Suppose you have a map that generates (key, value) pairs with many duplicates and you want to remove the duplicates. To do so, use ([key, value], null) as the map output.
@@ -124, +114 @@

  If you then want to know the total count for each parent, you can use the ''group_level'' view parameter:
  ''startkey=["thisparent"]&endkey=["thisparent",{}]&inclusive_end=false&group_level=1''
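
  A minimal sketch of that idea (the field names `parent` and `value` are only illustrative): the map makes the (key, value) pair itself the view key, and a counting reduce collapses the duplicates.

  {{{
  // map
  function (doc) {
    emit([doc.parent, doc.value], null);
  }

  // reduce: count the rows that collapse into each distinct key
  function (keys, values, rereduce) {
    if (rereduce) {
      return sum(values);
    }
    return values.length;
  }
  }}}

  Queried with ''group=true'' this returns one row per distinct pair (with the number of duplicates as the value); the ''group_level'' query above then adds those counts up per parent.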
  
- <<Anchor(top_n_tags)>>
+ 
  == Retrieve the top N tags ==
  
  This snippet assumes your docs have a top-level `tags` element that is an array of strings. In theory it would work with an array of anything, but it has not been tested as such.
@@ -223, +213 @@

  
  When querying this reduce you should not use the `group` or `group_level` query string parameters. The returned reduce value will be an object containing the top `MAX` tag:count pairs.
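
  A rough sketch of such a map/reduce pair (not the page's original code) keeps a tag-to-count object in the reduce and trims it to `MAX` entries; note that trimming partial results makes the counts approximate:

  {{{
  // map
  function (doc) {
    if (doc.tags) {
      for (var i = 0; i < doc.tags.length; i++) {
        emit(doc.tags[i], 1);
      }
    }
  }

  // reduce
  function (keys, values, rereduce) {
    var MAX = 10; // how many top tags to keep (pick your own value)
    var counts = {};
    if (rereduce) {
      // values are partial {tag: count} objects; merge them
      for (var i = 0; i < values.length; i++) {
        for (var tag in values[i]) {
          counts[tag] = (counts[tag] || 0) + values[i][tag];
        }
      }
    } else {
      // keys are [tag, doc_id] pairs, values are the 1s from the map
      for (var j = 0; j < keys.length; j++) {
        counts[keys[j][0]] = (counts[keys[j][0]] || 0) + values[j];
      }
    }
    // keep only the MAX most frequent tags
    var tags = [];
    for (var t in counts) { tags.push(t); }
    tags.sort(function (a, b) { return counts[b] - counts[a]; });
    var top = {};
    for (var k = 0; k < MAX && k < tags.length; k++) {
      top[tags[k]] = counts[tags[k]];
    }
    return top;
  }
  }}}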
  
- <<Anchor(aggregate_sum)>>
+ 
  == Joining an aggregate sum along with related data ==
  
  Here is a modified example from the [[View_collation|View collation]] page.  Note that `group_level` needs to be set to `1` for it to return a meaningful `customer_details`.
@@ -261, +251 @@

  }}}
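
  For orientation, here is a sketch of the idea (the field names `type`, `customer_id`, `name` and `amount` are assumptions, not taken from the actual example): the customer document and its orders collate under the same first key element, and the reduce carries the customer details alongside the running sum.

  {{{
  // map
  function (doc) {
    if (doc.type === "customer") {
      emit([doc._id, 0], {customer_details: doc.name, total: 0});
    } else if (doc.type === "order") {
      emit([doc.customer_id, 1], {customer_details: null, total: doc.amount});
    }
  }

  // reduce (works unchanged for the re-reduce case, since the value
  // shape is the same on both paths)
  function (keys, values, rereduce) {
    var result = {customer_details: null, total: 0};
    for (var i = 0; i < values.length; i++) {
      if (values[i].customer_details !== null) {
        result.customer_details = values[i].customer_details;
      }
      result.total += values[i].total;
    }
    return result;
  }
  }}}

  Queried with `group_level=1`, each row is then keyed by the customer id and carries both the summed order total and the customer details.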
  
  
- <<Anchor(standard_deviation)>>
  == Computing the standard deviation ==
  This example is from the CouchDB test suite. It is '''much''' simpler than the following example ([[#summary_stats|Computing simple summary statistics (min,max,mean,standard deviation)]]), although it does not calculate min, max, and mean (but adding them should be an easy exercise).
  
@@ -311, +300 @@

  }}}
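
  The flavour of that approach can be sketched as follows (a simplified illustration, not the verbatim test-suite code): the reduce carries `count`, `sum` and `sumsqr`, from which the standard deviation follows as sqrt(sumsqr/count - (sum/count)^2).

  {{{
  // map: emit the numeric value of interest (the field name is an assumption)
  function (doc) {
    emit(null, doc.value);
  }

  // reduce
  function (keys, values, rereduce) {
    var acc = {count: 0, sum: 0, sumsqr: 0};
    for (var i = 0; i < values.length; i++) {
      if (rereduce) {
        // values are partial {count, sum, sumsqr} objects
        acc.count += values[i].count;
        acc.sum += values[i].sum;
        acc.sumsqr += values[i].sumsqr;
      } else {
        // values are the raw numbers emitted by the map
        acc.count += 1;
        acc.sum += values[i];
        acc.sumsqr += values[i] * values[i];
      }
    }
    return acc;
  }
  }}}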
  
  
- <<Anchor(summary_stats)>>
+ 
  == Computing simple summary statistics (min,max,mean,standard deviation)  ==
  
  This implementation of standard deviation is more complex than the above algorithm, which Chan, Golub, and Le``Veque call the "textbook one-pass algorithm". While the one-pass formula is mathematically equivalent to the standard two-pass computation of standard deviation, it can be numerically unstable under certain conditions. Specifically, if the square-of-the-sum and sum-of-squares terms are large, they will each be computed with some rounding error. If the variance of the data set is small, subtracting those two large (slightly rounded) numbers can wipe out the computation of the variance. See http://www.jstor.org/stable/2683386, http://people.xiph.org/~tterribe/notes/homs.html, and the Wikipedia description of Knuth's algorithm, http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance.
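
  The numerically stabler route is to carry a running mean and the sum of squared deviations (`M2`), merging partitions with the pairwise update from Chan, Golub and Le``Veque instead of subtracting two large sums. A condensed sketch of a reduce along those lines (simplified, not the page's full implementation, and assuming the map emits the raw numbers as values):

  {{{
  function (keys, values, rereduce) {
    // each partial result is {count, mean, M2, min, max}; M2 is the sum of
    // squared deviations from the mean
    function combine(a, b) {
      if (a.count === 0) { return b; }
      if (b.count === 0) { return a; }
      var count = a.count + b.count;
      var delta = b.mean - a.mean;
      return {
        count: count,
        mean: a.mean + delta * b.count / count,
        M2: a.M2 + b.M2 + delta * delta * a.count * b.count / count,
        min: Math.min(a.min, b.min),
        max: Math.max(a.max, b.max)
      };
    }
    var acc = {count: 0, mean: 0, M2: 0, min: Infinity, max: -Infinity};
    for (var i = 0; i < values.length; i++) {
      var v = rereduce ? values[i]
                       : {count: 1, mean: values[i], M2: 0,
                          min: values[i], max: values[i]};
      acc = combine(acc, v);
    }
    // sample standard deviation = Math.sqrt(acc.M2 / (acc.count - 1))
    return acc;
  }
  }}}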
@@ -706, +695 @@

  
  For example: you can now query your view and retrieve all documents that do not contain the field `role` (view/NAME/?key="role").
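
  One way to sketch such a view is to check each document against a hard-coded list of expected fields and emit the names of the ones that are missing (the list below is only an example):

  {{{
  function (doc) {
    // fields we expect documents to carry; adjust to your schema
    var expected = ["role", "name", "email"];
    for (var i = 0; i < expected.length; i++) {
      if (!doc.hasOwnProperty(expected[i])) {
        // emit the name of the *missing* field, so key="role" finds
        // exactly the documents that lack a role field
        emit(expected[i], null);
      }
    }
  }
  }}}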
  
- <<Anchor(geospatial_indexes)>>
+ 
  == Using views to search for and sort documents geographically ==
  
  If you use latitude/longitude information in your documents, it's not very easy to sort on proximity to a given point using the normal approach (a key of [<latitude>, <longitude>]). This is because latitude and longitude are independent axes, which doesn't map well onto CouchDB's index sorting, which is a linear sort over a single key. However, using a [[http://en.wikipedia.org/wiki/Geohash|geohash]] may solve this, by letting you convert the coordinates of a location into a single string that sorts well (e.g., locations that are close together usually share a common prefix).
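
  A sketch of such a map function is shown below; it inlines a minimal geohash encoder and assumes the documents carry `latitude` and `longitude` fields (the field names and the precision are illustrative choices):

  {{{
  function (doc) {
    // minimal geohash encoder: alternately halve the longitude and latitude
    // ranges and pack the resulting bits, five at a time, into base-32 digits
    var BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";
    function encodeGeohash(lat, lon, precision) {
      var latRange = [-90, 90], lonRange = [-180, 180];
      var hash = "", bit = 0, ch = 0, even = true;
      while (hash.length < precision) {
        var range = even ? lonRange : latRange;
        var value = even ? lon : lat;
        var mid = (range[0] + range[1]) / 2;
        if (value >= mid) {
          ch |= (1 << (4 - bit));
          range[0] = mid;
        } else {
          range[1] = mid;
        }
        even = !even;
        if (bit < 4) {
          bit += 1;
        } else {
          hash += BASE32.charAt(ch);
          bit = 0;
          ch = 0;
        }
      }
      return hash;
    }
    if (doc.latitude != null && doc.longitude != null) {
      // nearby locations tend to share a key prefix, so prefix range queries
      // (startkey/endkey) roughly correspond to "within this area"
      emit(encodeGeohash(doc.latitude, doc.longitude, 9), null);
    }
  }
  }}}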