Posted to commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2013/04/04 16:28:31 UTC

[Solr Wiki] Update of "UniqueKey" by ErickErickson

The "UniqueKey" page has been changed by ErickErickson:
http://wiki.apache.org/solr/UniqueKey?action=diff&rev1=11&rev2=12

   * UUID data generated from data in the document
  
  == Text field in the document ==
-  . In the blog RSS example above, the URL of each article. The field must be single-valued.
+  * In the blog RSS example above, the URL of each article. The field must be single-valued. 
+  * It is '''strongly''' advised to use one of the un-analyzed field types (e.g. string) for textual unique keys. Using a solr.TextField with an analysis chain does not produce errors, but it also does not do what you might expect: the output of the analysis chain is ''not'' used as the unique key. The raw input before analysis is ''still'' used, which leads to duplicate documents (e.g. docs with unique keys of 'id1' and 'ID1' will be two distinct docs even if you have a LowerCaseFilter in the analysis chain for the unique key). Any normalization of the unique key should be done on the client side before ingestion.
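
The client-side normalization recommended above could look roughly like the following. This is only a minimal sketch: the class name {{{UniqueKeyNormalizer}}} and the choice of trimming plus lowercasing are assumptions made for illustration, not something the wiki page prescribes. Whether lowercasing is appropriate depends on the keys (URL paths, for instance, can be case-sensitive).

```java
import java.util.Locale;

/**
 * Hypothetical helper: normalizes a textual unique key on the client
 * before the document is sent to Solr, so that 'id1' and 'ID1' map to
 * the same key. The normalization rules (trim + lowercase) are an
 * assumption for this example; pick rules that suit your own keys.
 */
public class UniqueKeyNormalizer {

    static String normalize(String rawKey) {
        if (rawKey == null) {
            throw new IllegalArgumentException("unique key must not be null");
        }
        // Locale.ROOT keeps lowercasing independent of the JVM's default locale.
        String key = rawKey.trim().toLowerCase(Locale.ROOT);
        if (key.isEmpty()) {
            throw new IllegalArgumentException("unique key must not be blank");
        }
        return key;
    }

    public static void main(String[] args) {
        // Both spellings collapse to one key, so they would address one document.
        System.out.println(normalize("ID1").equals(normalize(" id1 ")));
    }
}
```

The normalized value is what would be placed in the unique key field of the document, with that field declared as an un-analyzed type such as string.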
  
  == UUID techniques ==
   . UUID is short for Universally Unique IDentifier. The UUID standard [[http://www.ietf.org/rfc/rfc4122.txt|RFC-4122]] defines several versions of UUID, each generated from different inputs. There is a UUID field type (called {{{UUIDField}}}) in Solr 1.4 which implements version 4 (randomly generated UUIDs). Fields are defined in the schema.xml file with: