Posted to solr-user@lucene.apache.org by Keith Dopson <kd...@ritternet.com> on 2018/03/07 05:43:27 UTC

What is creating certain fields?

My default query produces this:

    {
        "id":"44419",
        "date":["11/13/17 13:18"],
        "url":["http://www.someurl.com"],
        "title":["some title"],
        "content":["some indexed content..........."],
        "date_str":["11/13/17 13:18"],
        "url_str":["http://www.someurl.com"],
        "title_str":["some title"],
        "_version_":1594211356390719488,
        "content_str":["some indexed content.........."]
    }


In my managed_schema file, I have only five populated fields:

    <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />

    <field name="date"    type="text_general" indexed="false" stored="true"/>
    <field name="url"     type="text_general" indexed="false" stored="true"/>
    <field name="title"   type="text_general" indexed="true"  stored="true"/>
    <field name="content" type="text_general" indexed="true"  stored="true"/>

While other fields are declared, none of them are populated by my "post" command.

My question is: where are the xxxxx_str fields coming from?
I.e., what is producing the

    "date_str":["...
    "url_str":["...
    "title_str":["...
    "content_str":["...

entries?

Thanks in advance.



Re: What is creating certain fields?

Posted by Cassandra Targett <ca...@gmail.com>.
I'll guess you're using Solr 7.x and those fields in your schema were
created automatically?

As of Solr 7.0, schemaless mode's field guessing adds a copyField rule for
any field it guesses to be text, copying the first 256 characters to a
multivalued string field. The way it works: the field is created with the
type "text_general", and a copyField is then created automatically whose
destination matches the dynamic field rule "*_str", which produces the
multivalued string field.
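
As a rough sketch of what that leaves in the managed schema for one guessed
field (the "title" field and the "*_str" rule are taken from this thread; the
exact attributes in your own schema may differ, so treat this as an
illustration, not a copy of your file):

```xml
<!-- Field created by field guessing because the value looked like text -->
<field name="title" type="text_general"/>

<!-- Dynamic field rule that any name ending in _str matches -->
<dynamicField name="*_str" type="strings" docValues="true"
              indexed="false" stored="false"/>

<!-- copyField added alongside the guessed field; maxChars reflects the
     256-character limit described above -->
<copyField source="title" dest="title_str" maxChars="256"/>
```

With rules like these in place, every document that has a "title" value also
gets a "title_str" value, which would explain the extra entries in the query
output.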

This came from https://issues.apache.org/jira/browse/SOLR-9526.

You can disable the behavior if you want to by removing the copyField
section from the field-guessing configuration. See the docs for where in
solrconfig.xml you will want to edit:
https://lucene.apache.org/solr/guide/schemaless-mode.html#enable-field-class-guessing
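
For orientation, the section in question looks roughly like this in the
default Solr 7 configset. This is a sketch from memory of the stock
solrconfig.xml, so check it against your own file before editing; only the
String mapping is shown, and the other type mappings are omitted:

```xml
<updateProcessor class="solr.AddSchemaFieldsUpdateProcessorFactory"
                 name="add-schema-fields">
  <lst name="typeMapping">
    <str name="valueClass">java.lang.String</str>
    <str name="fieldType">text_general</str>
    <!-- Removing this copyField block stops new *_str copies
         from being set up for newly guessed text fields -->
    <lst name="copyField">
      <str name="dest">*_str</str>
      <int name="maxChars">256</int>
    </lst>
    <bool name="default">true</bool>
  </lst>
</updateProcessor>
```

Note that this should only affect fields guessed from then on. copyField
rules already written to the managed schema would most likely have to be
removed separately, and existing documents reindexed, before the *_str values
disappear.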

Cassandra

On Wed, Mar 7, 2018 at 9:46 AM, Erick Erickson <er...@gmail.com>
wrote:

> Maybe  a copyField is realizing the dynamic fields?
>
>
> On Wed, Mar 7, 2018 at 7:43 AM, David Hastings
> <ha...@gmail.com> wrote:
> > those are dynamic fields.
> >
> >   <dynamicField name="*_str" type="strings" docValues="true"
> > indexed="false" stored="false"/>

Re: What is creating certain fields?

Posted by Erick Erickson <er...@gmail.com>.
Maybe a copyField is realizing the dynamic fields?


On Wed, Mar 7, 2018 at 7:43 AM, David Hastings
<ha...@gmail.com> wrote:
> those are dynamic fields.
>
>   <dynamicField name="*_str" type="strings" docValues="true"
> indexed="false" stored="false"/>

Re: What is creating certain fields?

Posted by David Hastings <ha...@gmail.com>.
Those are dynamic fields.

  <dynamicField name="*_str" type="strings" docValues="true" indexed="false" stored="false"/>
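
A side note on why a stored="false" field shows up in results at all: with
docValues enabled, Solr by default returns the docValues as if they were
stored. If the copies are wanted for faceting or sorting but not in the
output, one option (a sketch, assuming a schema version recent enough to
support the useDocValuesAsStored attribute) would be:

```xml
<!-- Assumption: the same *_str rule, with useDocValuesAsStored switched off
     so the copies are not returned for fl=*; illustration only -->
<dynamicField name="*_str" type="strings" docValues="true"
              indexed="false" stored="false" useDocValuesAsStored="false"/>
```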

