Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2012/05/23 02:32:40 UTC

[jira] [Updated] (SOLR-2796) AddUpdateCommand.getIndexedId doesn't work with schema configured defaults/copyField - UUIDField/copyField can not be used as uniqueKey field

     [ https://issues.apache.org/jira/browse/SOLR-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-2796:
---------------------------

    Description: 
In Solr 1.4 and at the HEAD of the 3x branch, UUIDField can be used as the uniqueKey field even if documents do not specify a value, by taking advantage of the {{default="NEW"}} feature of UUIDField.

Similarly, a copyField can be used to populate the uniqueKey field with data from some field with another name -- multiple copyFields can even be used if there is no overlap (i.e., if you have two different types of documents with no overlap in their id space, you can copy from companyId->id and from productId->id and use "id" as your uniqueKey field in Solr).

Neither of these approaches works in Solr trunk because of how {{AddUpdateCommand.getIndexedId}} is currently used by DirectUpdateHandler2 (see [r1152500|http://svn.apache.org/viewvc?view=revision&revision=1152500]).

  was:in Solr 1.4, and the HEAD of the 3x branch, the UUIDField can be used as the uniqueKey field even if documents do not specify a value by taking advantage of the {{default="NEW"}} feature of UUIDField.  but something has changed in trunk to break this behavior.

       Priority: Blocker  (was: Major)
        Summary: AddUpdateCommand.getIndexedId doesn't work with schema configured defaults/copyField - UUIDField/copyField can not be used as uniqueKey field  (was: AddUpdateCommand.getIndexedId doesn't work with schema configured defaults - UUIDField can not be used as uniqueKey field)

Updating the description after looking into it a bit more.

Even if we reverted some of the logic in {{AddUpdateCommand.getIndexedId}} to work the way {{DirectUpdateHandler.getIndexedId(Document)}} did in the 3x branch, this deferred creation of the uniqueKey field fundamentally can't work in SolrCloud: the value of the uniqueKey field has to be known well before any schema defaults/copyFields are applied, so that the distrib processor knows which shard to forward to.
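
To make the ordering problem concrete, here is a rough sketch of routing a document by its uniqueKey. This is not Solr's actual routing code: the class and method names are hypothetical, the document is modelled as a plain Map, and "hash modulo shard count" is only a stand-in for whatever the distrib processor really does. The point is just that a document whose id would be filled in later by a schema default or copyField cannot be routed at this stage.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Solr.
final class ShardRoutingSketch {

    // Picks a shard from the uniqueKey value alone. Returns -1 when the
    // document carries no value for the uniqueKey field, meaning it cannot
    // be routed yet.
    static int shardFor(Map<String, Object> doc, String uniqueKeyField, int numShards) {
        Object id = doc.get(uniqueKeyField);
        if (id == null) {
            // A schema default or copyField would only populate the field
            // later, at index time, which is too late for routing.
            return -1;
        }
        // Assumed placeholder hashing scheme: String.hashCode modulo shard count.
        return Math.floorMod(id.toString().hashCode(), numShards);
    }

    public static void main(String[] args) {
        Map<String, Object> withId = new LinkedHashMap<>();
        withId.put("id", "product-42");

        Map<String, Object> withoutId = new LinkedHashMap<>();
        withoutId.put("productId", "42"); // would be copied to "id" by a copyField

        System.out.println("explicit id -> shard " + shardFor(withId, "id", 4));
        System.out.println("deferred id -> shard " + shardFor(withoutId, "id", 4));
    }
}
```

The second document is the copyField case described above: the value exists, but not yet under the uniqueKey field name, so the router has nothing to hash.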

I think we should bite the bullet and say "Starting with Solr 4, schema defaults and copyFields can not be used to populate the uniqueKey field" (we can even enforce this when parsing the schema - error if the uniqueKey field has a declared default or is the dest of a copyField) and provide UpdateProcessor alternatives for the behaviors that were previously possible with schema options...

 * FieldCopyUpdateProcessor - SOLR-2599
 * UUIDFieldUpdateProcessor - generates a new UUID for a configured field name if it doesn't already have a value in it
 * TimestampUpdateProcessor - generates a new Date for a configured field name if it doesn't already have a value in it (it's unlikely anyone is using a DateField as their uniqueKey, but it's possible, and fairly easy to offer this just in case)
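
As a rough sketch of the "generate a value if the field doesn't already have one" behavior proposed for the UUID processor: the code below is not written against the real UpdateProcessor API. The class and method names are hypothetical, and the document is modelled as a plain Map purely for illustration. It only shows the intended semantics: an existing value is left alone, and a missing one is filled in before anything downstream needs the uniqueKey.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical illustration only; not the actual Solr processor.
final class UuidDefaultSketch {

    // Adds a freshly generated UUID under fieldName unless the document
    // already has a non-empty value there.
    static void ensureUuid(Map<String, Object> doc, String fieldName) {
        Object existing = doc.get(fieldName);
        if (existing == null || existing.toString().isEmpty()) {
            doc.put(fieldName, UUID.randomUUID().toString());
        }
    }

    public static void main(String[] args) {
        Map<String, Object> noId = new LinkedHashMap<>();
        noId.put("name", "some document");
        ensureUuid(noId, "id"); // "id" is generated here
        System.out.println(noId);

        Map<String, Object> hasId = new LinkedHashMap<>();
        hasId.put("id", "client-supplied-id");
        ensureUuid(hasId, "id"); // existing value is kept
        System.out.println(hasId);
    }
}
```

Because this runs as an explicit step on the incoming document rather than as a schema default, the id exists before the distrib processor has to pick a shard. A timestamp variant would look the same with a new Date in place of the UUID.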

thoughts?

                
> AddUpdateCommand.getIndexedId doesn't work with schema configured defaults/copyField - UUIDField/copyField can not be used as uniqueKey field
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-2796
>                 URL: https://issues.apache.org/jira/browse/SOLR-2796
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>    Affects Versions: 4.0
>            Reporter: Hoss Man
>            Priority: Blocker
>             Fix For: 4.0
>
>         Attachments: SOLR-2796.patch
>
>
> In Solr 1.4 and at the HEAD of the 3x branch, UUIDField can be used as the uniqueKey field even if documents do not specify a value, by taking advantage of the {{default="NEW"}} feature of UUIDField.
> Similarly, a copyField can be used to populate the uniqueKey field with data from some field with another name -- multiple copyFields can even be used if there is no overlap (i.e., if you have two different types of documents with no overlap in their id space, you can copy from companyId->id and from productId->id and use "id" as your uniqueKey field in Solr).
> Neither of these approaches works in Solr trunk because of how {{AddUpdateCommand.getIndexedId}} is currently used by DirectUpdateHandler2 (see [r1152500|http://svn.apache.org/viewvc?view=revision&revision=1152500]).
