Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2010/03/05 21:00:39 UTC

[Solr Wiki] Update of "UniqueKey" by ChrisHarris

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "UniqueKey" page has been changed by ChrisHarris.
http://wiki.apache.org/solr/UniqueKey?action=diff&rev1=7&rev2=8

--------------------------------------------------

    Use cases change, and you may want to change the identity of the documents. For example, an RSS feed for videos might change to give different entries for the same video in different sizes. You may decide that the different entries are really the same document.
    There is a saying in database design: ''data sticks where it lands''. Once you store data in a given format and container, that decision is very hard to change. By adding a layer of indirection to the identity in the Solr schema, you give yourself the ability to change the innate identity of the document.
   * Multiple queries about the same document, with document id saved for future reference.
-  * Delete documents.
+  * Delete documents. (Though you can also delete documents matching a query, rather than by unique key value.)
+  * If you use DistributedSearch, you need a unique key. As an added benefit, if the same document (as determined by unique key) ends up indexed in multiple shards, only one of the copies will be returned in the user's query results.
+ 
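The two deletion styles mentioned in the list above can be sketched as the XML update messages that Solr accepts. This is a minimal sketch, not a full client: the helper names, the key value "video-42", and the field names in the query are hypothetical, and posting the message to the update handler is left out.

```python
import xml.etree.ElementTree as ET


def delete_by_id(unique_key_value: str) -> str:
    """Build a delete message that targets one document by its unique key."""
    root = ET.Element("delete")
    ET.SubElement(root, "id").text = unique_key_value
    return ET.tostring(root, encoding="unicode")


def delete_by_query(query: str) -> str:
    """Build a delete message that targets every document matching a query."""
    root = ET.Element("delete")
    ET.SubElement(root, "query").text = query
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "video-42" and the field names below are made-up illustration values.
    print(delete_by_id("video-42"))
    print(delete_by_query("source:rss AND size:small"))
```

Deleting by id only works when the schema declares a unique key; deleting by query works either way, but removes every matching document.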
  == Use cases which require a unique key generated from data in the document ==
   * Allow different database systems to create identity keys that work in other systems.
    The documents may come from multiple sources, and be stored in multiple places. There may not be one convenient place in the indexing path to create a unique id. The different sources will need to separately implement the same algorithm. The key should be a short unique string (see UUID below).
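
One way for separate sources to implement the same algorithm is a name-based UUID, which yields the same short string for the same input wherever it is computed. The sketch below assumes each document can be identified by a source name plus an id native to that source; the function name, the normalization rules, and the sample values are hypothetical, not something the page prescribes.

```python
import uuid


def make_unique_key(source: str, native_id: str) -> str:
    """Derive a deterministic key from data already present in the document.

    Every system that applies the same normalization and namespace
    will compute the same key without coordinating with the others.
    """
    # Assumed normalization: trim whitespace, lowercase the source name.
    name = f"{source.strip().lower()}/{native_id.strip()}"
    # uuid5 is a name-based (SHA-1) UUID: same namespace + name -> same UUID.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, name))


if __name__ == "__main__":
    # Two indexers handling the same feed entry agree on the key.
    key_a = make_unique_key("RSS-Videos", "entry-1001")
    key_b = make_unique_key("rss-videos ", "entry-1001")
    print(key_a == key_b)
```

The design choice here is that the key depends only on the document's own data, so no single point in the indexing path has to hand out ids.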