Posted to dev@lucene.apache.org by "Alexandre Rafalovitch (JIRA)" <ji...@apache.org> on 2014/11/19 02:44:34 UTC

[jira] [Commented] (SOLR-6633) let /update/json/docs store the source json as well

    [ https://issues.apache.org/jira/browse/SOLR-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217243#comment-14217243 ] 

Alexandre Rafalovitch commented on SOLR-6633:
---------------------------------------------

This is truly just storing the original document, right? And only returning it as a whole as well?

Because, in Elasticsearch, the corresponding field (*_source*) is actually used as the source for several operations. For example, it serves as the source for partial updates because, by default, fields are not stored individually. And, I think, the *_source* field also gets rewritten/re-created on update, again because it is actually used as the source of truth.

The second issue I wanted to raise is how this will interplay with UpdateRequestProcessors (ES does not really have those). I guess URPs will run after the content of the field is captured, so the actual fields may look quite different from what's in *_src*.

Finally, I am not clear on what this really means: ??all fields go into the 'df'?? . Does this mean there is a magic copyField or something?

I think we need a more specific use case here, rather than just an implementation/configuration, especially since a similar-but-different implementation in Elasticsearch does not fully match Solr's setup.

> let /update/json/docs store the source json as well
> ---------------------------------------------------
>
>                 Key: SOLR-6633
>                 URL: https://issues.apache.org/jira/browse/SOLR-6633
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>              Labels: EaseOfUse
>             Fix For: 5.0, Trunk
>
>         Attachments: SOLR-6633.patch, SOLR-6633.patch
>
>
> It is a common requirement to store the entire JSON as a field in Solr.
> We can have an extra param, srcField=field_name, to specify the field name.
> The /update/json/docs handler is only useful when all the JSON fields are predefined or in schemaless mode.
> In other modes, the better option would be to store the content in a store-only field and index the data in another field.
> The relevant section in solrconfig.xml:
> {code:xml}
>  <initParams path="/update/json/docs">
>     <lst name="defaults">
>       <!--this ensures that the entire json doc will be stored verbatim into one field-->
>       <str name="srcField">_src</str>
>       <!--This means the uniqueKeyField will be extracted from the fields and
>        all fields go into the 'df' field. In this config df is already configured to be 'text'
>         -->
>       <str name="mapUniqueKeyOnly">true</str>
>     </lst>
>   </initParams>
> {code}
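
To make the question about 'df' concrete, here is a minimal Python sketch of one possible reading of the configuration above. It is not Solr's implementation. It assumes the uniqueKey field is named "id", that 'df' is 'text' (as the config comment says), and that every leaf value, including the uniqueKey value, is copied into 'df'. The function names map_json_doc and leaf_values are hypothetical.

```python
import json


def leaf_values(value):
    """Yield every scalar value found in a parsed JSON structure, depth first."""
    if isinstance(value, dict):
        for child in value.values():
            yield from leaf_values(child)
    elif isinstance(value, list):
        for child in value:
            yield from leaf_values(child)
    else:
        yield value


def map_json_doc(raw_json, src_field="_src", unique_key="id", df="text"):
    """Illustrative mapping only; the defaults mirror the quoted config.

    - src_field receives the incoming JSON string verbatim (srcField=_src).
    - unique_key is extracted if present (assumed here to be "id").
    - every leaf value goes into the df field (assumed reading of
      mapUniqueKeyOnly=true).
    """
    parsed = json.loads(raw_json)
    doc = {src_field: raw_json}
    if unique_key in parsed:
        doc[unique_key] = parsed[unique_key]
    doc[df] = [str(v) for v in leaf_values(parsed)]
    return doc


raw = '{"id": "1", "title": "Hello", "tags": ["a", "b"]}'
doc = map_json_doc(raw)
print(doc["_src"])
print(doc["id"])
print(doc["text"])
```

Under this reading, *_src* holds the string exactly as received, so anything an UpdateRequestProcessor changes afterwards would not be reflected in it.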


