Posted to solr-user@lucene.apache.org by Guilleret Florian <gu...@gmail.com> on 2017/06/23 09:26:48 UTC

Query Partial Matching on auto schema

Hi,

I use Solr 5.2.1 with an automatic schema.

But when I try a query with a partial word, Solr doesn't find anything.

Example:

I send this query:
   NHLDO

Solr returns nothing, but there is a document with the name NHLDO457.

If I send this query:
NHLDO457

Solr returns the document.


So how can I configure Solr to retrieve documents even with a partial word in
the query, with an automatic schema?

Kind Regards
Guilleret Florian

Re: Query Partial Matching on auto schema

Posted by Erick Erickson <er...@gmail.com>.
You can
1> use the "managed schema API", see:
https://lucene.apache.org/solr/guide/6_6/schema-api.html#schema-api
or
2> switch to "classic schema", see:
https://lucene.apache.org/solr/guide/6_6/schema-factory-definition-in-solrconfig.html
or
3> You can actually edit that file if you do it offline. The problem
is that if you do hand-edits _and_ use the managed schema to
programmatically change the file, your hand-edits will be overwritten.
Safest is to shut all your Solr nodes down, hand-edit the schema file,
and then start them all back up.
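
For option 1, here is a minimal sketch in Python of what the Schema API
request could look like. It is not a definitive recipe. The field type
name "text_prefix", the field name "name_prefix", the gram sizes, the
added lowercase filter, and the base URL and collection name passed to
post_schema_commands are all assumptions made for illustration. As far as
I know, EdgeNGramFilterFactory defaults to minGramSize=1 and maxGramSize=1,
so the sizes are set explicitly here.

```python
import json
import urllib.request


def build_schema_commands(field_type="text_prefix", field="name_prefix",
                          min_gram=1, max_gram=15):
    """Build Schema API commands for an edge n-gram field type plus one field.

    The field type and field names are hypothetical placeholders.
    """
    tokenizer = {"class": "solr.StandardTokenizerFactory"}
    lowercase = {"class": "solr.LowerCaseFilterFactory"}
    edge_ngram = {
        "class": "solr.EdgeNGramFilterFactory",
        "minGramSize": str(min_gram),
        "maxGramSize": str(max_gram),
    }
    return {
        "add-field-type": {
            "name": field_type,
            "class": "solr.TextField",
            # Index time: emit the prefixes of every token.
            "indexAnalyzer": {"tokenizer": tokenizer,
                              "filters": [lowercase, edge_ngram]},
            # Query time: no n-grams, so the typed prefix is matched as-is.
            "queryAnalyzer": {"tokenizer": tokenizer,
                              "filters": [lowercase]},
        },
        "add-field": {"name": field, "type": field_type,
                      "indexed": True, "stored": False},
    }


def post_schema_commands(base_url, collection, commands):
    """POST the commands to <base_url>/<collection>/schema and return the reply."""
    request = urllib.request.Request(
        f"{base_url}/{collection}/schema",
        data=json.dumps(commands).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look something like
post_schema_commands("http://localhost:8983/solr", "mycollection",
build_schema_commands()), with the URL and collection replaced by your
own. Documents indexed before the change have to be reindexed before the
new field can match anything.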

Best,
Erick

On Thu, Aug 17, 2017 at 1:00 AM, Guilleret Florian
<gu...@gmail.com> wrote:
> Thank you for your answer.
>
> So I saw that in solrconfig.xml I have:
>
>   <schemaFactory class="ManagedIndexSchemaFactory">
>     <bool name="mutable">true</bool>
>     <str name="managedSchemaResourceName">managed-schema</str>
>   </schemaFactory>
>
>
> And in the managed-schema file I have the whole schema:
>
> <!-- Solr managed schema - automatically generated - DO NOT EDIT -->
>
>
> But Solr says not to edit managed-schema. So where do I need to put
> this?
>
> <analyzer>
>   <tokenizer class="solr.StandardTokenizerFactory"/>
>   <filter class="solr.EdgeNGramFilterFactory"/>
> </analyzer>
>
> Guilleret Florian <http://www.girodmedical.com/>
> Tel : +33 6 21 28 43 06
>

Re: Query Partial Matching on auto schema

Posted by Guilleret Florian <gu...@gmail.com>.
Thank you for your answer.

So I saw that in solrconfig.xml I have:

  <schemaFactory class="ManagedIndexSchemaFactory">
    <bool name="mutable">true</bool>
    <str name="managedSchemaResourceName">managed-schema</str>
  </schemaFactory>


And in the managed-schema file I have the whole schema:

<!-- Solr managed schema - automatically generated - DO NOT EDIT -->


But Solr says not to edit managed-schema. So where do I need to put
this?

<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.EdgeNGramFilterFactory"/>
</analyzer>

Guilleret Florian <http://www.girodmedical.com/>
Tel : +33 6 21 28 43 06

2017-06-23 16:48 GMT+02:00 Erick Erickson <er...@gmail.com>:

> I simply do not recommend going to production with schemaless. That
> mechanism must make certain assumptions about the data and simply
> cannot anticipate all the types of searching you need to do.
>
> As Alessandro says, you can define whatever you want "by hand" and
> still have schemaless add input. It becomes a matter of preference,
> would you rather have documents with fields that haven't been seen
> before fail immediately? Or would you rather have them get new fields
> that you then have to discover? I prefer the former.
>
> Best,
> Erick
>

Re: Query Partial Matching on auto schema

Posted by Erick Erickson <er...@gmail.com>.
I simply do not recommend going to production with schemaless. That
mechanism must make certain assumptions about the data and simply
cannot anticipate all the types of searching you need to do.

As Alessandro says, you can define whatever you want "by hand" and
still have schemaless add input. It becomes a matter of preference,
would you rather have documents with fields that haven't been seen
before fail immediately? Or would you rather have them get new fields
that you then have to discover? I prefer the former.

Best,
Erick

On Fri, Jun 23, 2017 at 3:41 AM, alessandro.benedetti
<a....@sease.io> wrote:
> Quoting the official solr documentation :
> " You Can Still Be Explicit
> Even if you want to use schemaless mode for most fields, you can still use
> the Schema API to pre-emptively create some fields, with explicit types,
> before you index documents that use them.
>
> Internally, the Schema API and the Schemaless Update Processors both use the
> same Managed Schema functionality."
>
> Even using schemaless, you can use the managed schema API to define your own
> field types and fields.
>
> For more info [1]
>
> [1]
> https://lucene.apache.org/solr/guide/6_6/schemaless-mode.html#SchemalessMode-EnableManagedSchema
>
>
>

Re: Query Partial Matching on auto schema

Posted by "alessandro.benedetti" <a....@sease.io>.
Quoting the official Solr documentation:
"You Can Still Be Explicit
Even if you want to use schemaless mode for most fields, you can still use
the Schema API to pre-emptively create some fields, with explicit types,
before you index documents that use them.

Internally, the Schema API and the Schemaless Update Processors both use the
same Managed Schema functionality."

Even using schemaless, you can use the managed schema API to define your own
field types and fields.

For more info [1]

[1]
https://lucene.apache.org/solr/guide/6_6/schemaless-mode.html#SchemalessMode-EnableManagedSchema



-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
View this message in context: http://lucene.472066.n3.nabble.com/Query-Partial-Matching-on-auto-schema-tp4342502p4342509.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Query Partial Matching on auto schema

Posted by Guilleret Florian <gu...@gmail.com>.
Yes, I mean schemaless.

So with schemaless, is it impossible to get what I expect?

Guilleret Florian <http://www.girodmedical.com/>
Tel : +33 6 21 28 43 06

2017-06-23 12:26 GMT+02:00 alessandro.benedetti <a....@sease.io>:

> By automatic schema, do you mean schemaless?
> You will need to define a schema, either managed or in the old legacy
> style, as you prefer.
>
> Then you define a field type that suits your needs (for example, with an
> edge n-gram token filter [1]).
> And you assign that field type to a specific field.
>
> Then, in your request handler or when you build your query, just use that
> field to search.
>
> Regards
>
> [1]
> https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#FilterDescriptions-EdgeN-GramFilter
>
>
>
>

Re: Query Partial Matching on auto schema

Posted by "alessandro.benedetti" <a....@sease.io>.
By automatic schema, do you mean schemaless?
You will need to define a schema, either managed or in the old legacy
style, as you prefer.

Then you define a field type that suits your needs (for example, with an
edge n-gram token filter [1]).
And you assign that field type to a specific field.

Then, in your request handler or when you build your query, just use that
field to search.
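
To illustrate why this helps, here is a small Python simulation of the
idea. It is not Solr code, only a sketch under assumptions: the function
names, the document id "doc1", and the gram sizes are made up for the
example. Each indexed token is expanded into its leading prefixes, so a
query for a prefix finds an exact term in the index.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the leading prefixes of token, from min_gram to max_gram characters."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def build_index(docs, min_gram=1, max_gram=15):
    """Map every lowercased edge n-gram to the ids of the documents containing it."""
    index = {}
    for doc_id, name in docs.items():
        for gram in edge_ngrams(name.lower(), min_gram, max_gram):
            index.setdefault(gram, set()).add(doc_id)
    return index


def search(index, query):
    """Look up the lowercased query as a single exact term (no n-grams at query time)."""
    return sorted(index.get(query.lower(), set()))


index = build_index({"doc1": "NHLDO457"})
print(search(index, "NHLDO"))     # partial word from the original question
print(search(index, "NHLDO457"))  # full word
print(search(index, "457"))       # not a prefix, so edge n-grams do not cover it
```

Only prefixes are covered by this approach. A query for a fragment in the
middle or at the end of the word, such as 457, would need a different
analysis chain.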

Regards

[1]
https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#FilterDescriptions-EdgeN-GramFilter



-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
View this message in context: http://lucene.472066.n3.nabble.com/Query-Partial-Matching-on-auto-schema-tp4342502p4342506.html
Sent from the Solr - User mailing list archive at Nabble.com.