Posted to solr-user@lucene.apache.org by Denis WSRosa <de...@gmail.com> on 2011/08/17 22:18:04 UTC
Solr Accent Insensitive and sensitive search
Hi all!
I have configured my schema to use the solr.ASCIIFoldingFilterFactory
filter, so I'm able to search for a word like "ferias" and get "férias".
But when I search for the exact word "férias", I get no results.
Is there a way to configure both cases in the search?
Best Regards!
--
Denis Wilson Souza Rosa
----------------------------------------------------
Systems Architect
mobile: +55 11 8112 8284
email: deniswsrosa@gmail.com / deniswsrosa@hotmail.com
Re: Solr Accent Insensitive and sensitive search
Posted by Erick Erickson <er...@gmail.com>.
Well, we can't tell, because you haven't identified the field you are
working with.
So, we need two additional pieces of information:
the query you use that works
the query you use that doesn't work
Attach &debugQuery=on to both of them and post the results back, please.
Looking at the admin/analysis page with the fields and input in
question may also help you get an idea of what's going on. Also, the
"full interface" on the admin page will put the debug information in a
pretty format (make sure to check the "debug" checkbox).
How are you trying to get an "exact match?"
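A minimal sketch of the two debug requests asked for above, in Python. The host, port, and /solr/select path are assumptions about a default local install, and DocumentDisplayName is assumed only because it is the defaultSearchField in the posted schema; substitute whichever field is actually being queried. The helper name debug_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: a default local Solr install; adjust host, port, and path as needed.
SOLR_SELECT = "http://localhost:8983/solr/select"


def debug_url(term, field="DocumentDisplayName"):
    """Build a select URL for field:term with debugQuery=on appended."""
    params = {"q": f"{field}:{term}", "debugQuery": "on"}
    # urlencode percent-encodes the colon and the UTF-8 bytes of accented letters.
    return f"{SOLR_SELECT}?{urlencode(params)}"


working_url = debug_url("ferias")   # the query that returns results
failing_url = debug_url("férias")   # the query that returns nothing

print(working_url)
print(failing_url)
```

Comparing the parsed query in the debug section of both responses should show whether the accented term is being folded to the same token as the unaccented one.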
Best
Erick
On Thu, Aug 18, 2011 at 8:26 AM, Denis WSRosa <de...@gmail.com> wrote:
> Hi! Thank you for your response!
>
> here is my full schema:
>
> [...]
>
>
> What am I doing wrong?
>
>
>
>
> On Wed, Aug 17, 2011 at 5:37 PM, Michael Ryan <mr...@moreover.com> wrote:
>
>> Are you using the same analyzer for both type="query" and type="index"? Can
>> you show us the fieldType from your schema?
>>
>> -Michael
>>
>
>
>
> --
> Denis Wilson Souza Rosa
> ----------------------------------------------------
> Systems Architect
> mobile: +55 11 8112 8284
> email: deniswsrosa@gmail.com / deniswsrosa@hotmail.com
>
Re: Solr Accent Insensitive and sensitive search
Posted by Denis WSRosa <de...@gmail.com>.
Hi! Thank you for your response!
Here is my full schema:
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Licensed to the Apache Software Foundation (ASF) under one or more
contributor
license agreements. See the NOTICE file distributed with this work for
additional
information regarding copyright ownership. The ASF licenses this file to
You under the Apache License, Version 2.0 (the "License"); you may not
use
this file except in compliance with the License. You may obtain a copy
of
the License at http://www.apache.org/licenses/LICENSE-2.0 Unless
required
by applicable law or agreed to in writing, software distributed under
the
License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
CONDITIONS
OF ANY KIND, either express or implied. See the License for the specific
language governing permissions and limitations under the License. -->
<!-- This is the Solr schema file. This file should be named "schema.xml"
and should be in the conf directory under the solr home (i.e.
./solr/conf/schema.xml
by default) or located where the classloader for the Solr webapp can
find
it. This example schema is the recommended starting point for users. It
should
be kept correct and concise, usable out-of-the-box. For more
information,
on how to customize this file, please see
http://wiki.apache.org/solr/SchemaXml
PERFORMANCE NOTE: this schema includes many optional features and should
not be used for benchmarking. To improve performance one could - set
stored="false"
for all fields possible (esp large fields) when you only need to search
on
the field but don't need to return the original value. - set
indexed="false"
if you don't need to search on the field, but only return the field as a
result of searching on other indexed fields. - remove all unneeded
copyField
statements - for best index size and searching performance, set "index"
to
false for all general text fields, use copyField to copy them to the
catchall
"text" field, and use that for searching. - For maximum indexing
performance,
use the StreamingUpdateSolrServer java client. - Remember to run the JVM
in server mode, and use a higher logging level that avoids logging every
request -->
<schema name="example" version="1.4">
<types>
<fieldType name="uuid" class="solr.StrField" multiValued="false" />
<!-- Not analyzed field -->
<fieldType name="string" class="solr.StrField" multiValued="false"
omitNorms="true" />
<!-- boolean type: "true" or "false" -->
<fieldType name="boolean" class="solr.BoolField"
sortMissingLast="true" omitNorms="true" />
<!--Binary data type. The data should be sent/retrieved in as Base64
encoded
Strings -->
<fieldtype name="binary" class="solr.BinaryField" />
<!-- Default numeric field types. For faster range queries, consider
the
tint/tfloat/tlong/tdouble types. -->
<fieldType name="int" class="solr.TrieIntField"
precisionStep="0" omitNorms="true" positionIncrementGap="0" />
<fieldType name="float" class="solr.TrieFloatField"
precisionStep="0" omitNorms="true" positionIncrementGap="0" />
<fieldType name="long" class="solr.TrieLongField"
precisionStep="0" omitNorms="true" positionIncrementGap="0" />
<fieldType name="double" class="solr.TrieDoubleField"
precisionStep="0" omitNorms="true" positionIncrementGap="0" />
<fieldType name="date" class="solr.DateField"
sortMissingLast="true" omitNorms="true" />
<!-- Numeric field types that index each value at various levels of
precision
to accelerate range queries when the number of values between
the range endpoints
is large. See the javadoc for NumericRangeQuery for internal
implementation
details. Smaller precisionStep values (specified in bits) will
lead to more
tokens indexed per value, slightly larger index size, and faster
range queries.
A precisionStep of 0 disables indexing at different precision
levels. -->
<fieldType name="tint" class="solr.TrieIntField"
precisionStep="8" omitNorms="true" positionIncrementGap="0" />
<fieldType name="tfloat" class="solr.TrieFloatField"
precisionStep="8" omitNorms="true" positionIncrementGap="0" />
<fieldType name="tlong" class="solr.TrieLongField"
precisionStep="8" omitNorms="true" positionIncrementGap="0" />
<fieldType name="tdouble" class="solr.TrieDoubleField"
precisionStep="8" omitNorms="true" positionIncrementGap="0" />
<!-- A Trie based date field for faster date range queries and date
faceting. -->
<fieldType name="tdate" class="solr.TrieDateField"
omitNorms="true" precisionStep="6" positionIncrementGap="0" />
<!-- Key type fields, no filters -->
<fieldType name="keytype" class="solr.TextField"
multiValued="false" omitNorms="true">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory" />
</analyzer>
</fieldType>
<!-- A general text field that has reasonable, generic
cross-language defaults:
it tokenizes with StandardTokenizer, removes stop words from
case-insensitive
"stopwords.txt" (empty by default), and down cases. At query
time only, it
also applies synonyms. -->
<fieldType name="text" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory" />
<!-- <filter class="solr.StopFilterFactory"
ignoreCase="true" words="stopwords.txt"
enablePositionIncrements="true" /> -->
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory" />
<!-- filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true" / -->
<!--filter class="solr.SynonymFilterFactory"
synonyms="synonyms.txt"
ignoreCase="true" expand="true"/ -->
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
</fieldType>
<!-- lowercases the entire field value, keeping it as a single
token. -->
<fieldType name="tags" class="solr.TextField"
positionIncrementGap="100">
<analyzer>
<filter class="solr.ASCIIFoldingFilterFactory" />
<tokenizer class="solr.PatternTokenizerFactory" pattern=","
/>
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
</fieldType>
<!-- lowercases the entire field value, keeping it as a single
token. -->
<fieldType name="number" class="solr.TextField"
positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory" />
<filter class="solr.TrimFilterFactory" />
</analyzer>
</fieldType>
<!-- A general content field used for search. Should be used for
content
strings. This sort of field will have the html tags removed -->
<fieldType name="content" class="solr.TextField"
positionIncrementGap="100">
<analyzer>
<charFilter class="solr.HTMLStripCharFilterFactory" />
<tokenizer class="solr.StandardTokenizerFactory" />
<!-- <filter class="solr.StopFilterFactory"
ignoreCase="true" words="stopwords.txt"
enablePositionIncrements="true" /> -->
<!-- <filter class="solr.SynonymFilterFactory"
synonyms="index_synonyms.txt"
ignoreCase="true" expand="false"/> -->
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
</fieldType>
<!-- Just like text except it reverses the characters of each token,
to
enable more efficient leading wildcard queries. -->
<fieldType name="text_rev" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory" />
<!-- <filter class="solr.StopFilterFactory"
ignoreCase="true" words="stopwords.txt"
enablePositionIncrements="true" /> -->
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.ReversedWildcardFilterFactory"
withOriginal="true" maxPosAsterisk="3"
maxPosQuestion="2"
maxFractionAsterisk="0.33" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory" />
<!-- <filter class="solr.StopFilterFactory"
ignoreCase="true" words="stopwords.txt"
enablePositionIncrements="true" /> -->
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
</fieldType>
<!-- Just like tags except it reverses the characters of each token,
to
enable more efficient leading wildcard queries. -->
<fieldType name="tags_rev" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<filter class="solr.ASCIIFoldingFilterFactory" />
<tokenizer class="solr.PatternTokenizerFactory" pattern=","
/>
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.ReversedWildcardFilterFactory"
withOriginal="true" maxPosAsterisk="3"
maxPosQuestion="2"
maxFractionAsterisk="0.33" />
</analyzer>
<analyzer type="query">
<filter class="solr.ASCIIFoldingFilterFactory" />
<tokenizer class="solr.PatternTokenizerFactory" pattern=","
/>
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
</fieldType>
</types>
<fields>
<!-- Basic fields -->
<field name="UUID" type="uuid" indexed="true" stored="false"
multiValued="false" required="true" />
<field name="DocumentType" type="keytype" indexed="true"
stored="false"
required="true" multiValued="false"/>
<field name="DocumentLocale" type="string" indexed="true"
stored="true" required="false" />
<field name="DocumentId" type="string" indexed="false" stored="true"
required="true" />
<field name="DocumentName" type="text" indexed="true" stored="true"
required="false" />
<field name="DocumentDisplayName" type="text" indexed="true"
stored="true" required="true" />
<field name="DocumentCreateDate" type="text" indexed="false"
stored="true" required="false" />
<field name="DocumentLastUpdateDate" type="text" indexed="false"
stored="true" required="false" />
<field name="DocumentContent" type="content" indexed="true"
stored="false" required="false" />
<field name="DocumentMIME" type="text" indexed="true" stored="true"
required="false" />
<field name="DocumentTAGS" type="tags" indexed="true" stored="true"
required="false" />
<field name="URL" type="string" indexed="false" stored="true"
required="false" />
<field name="DocumentUSER" type="long" indexed="false" stored="true"
required="false" />
<field name="DocumentAuthor" type="string" indexed="false"
stored="true" required="false" />
<field name="DocumentSpace" type="long" indexed="false"
stored="true"
required="false" />
<field name="DocumentTenant" type="long" indexed="false"
stored="true"
required="false" />
<field name="DocumentDescription" type="text" indexed="false"
stored="true" required="false" />
<field name="META.Content-Type" type="string" indexed="false"
stored="false" required="false" />
<field name="DELETED" type="string" indexed="true" stored="false"
required="false" />
<!-- Extra Fields -->
<!-- Indexed general text field -->
<dynamicField name="*_text_i" type="text" indexed="true"
stored="false" />
<!-- Stored general text field -->
<dynamicField name="*_text_s" type="text" indexed="false"
stored="true" />
<!-- Indexed and stored general text field -->
<dynamicField name="*_text_is" type="text" indexed="true"
stored="true" />
<!-- Indexed general number field -->
<dynamicField name="*_long_i" type="long" indexed="true"
stored="false" />
<!-- Stored general number field -->
<dynamicField name="*_long_s" type="long" indexed="false"
stored="true" />
<!-- Indexed and stored general number field -->
<dynamicField name="*_long_is" type="long" indexed="true"
stored="true" />
<!-- Indexed general date field -->
<dynamicField name="*_date_i" type="long" indexed="true"
stored="false" />
<!-- Stored general date field -->
<dynamicField name="*_date_s" type="long" indexed="false"
stored="true" />
<!-- Indexed and stored general date field -->
<dynamicField name="*_date_is" type="long" indexed="true"
stored="true" />
<!-- Indexed general boolean field -->
<dynamicField name="*_boolean_i" type="boolean" indexed="true"
stored="false" />
<!-- Stored general boolean field -->
<dynamicField name="*_boolean_s" type="boolean" indexed="false"
stored="true" />
<!-- Indexed and stored general boolean field -->
<dynamicField name="*_boolean_is" type="boolean" indexed="true"
stored="true" />
<!-- Indexed general multivalued number fields -->
<dynamicField name="*_number_i" type="number" indexed="true"
stored="false" />
<!-- Stored general multivalued number fields -->
<dynamicField name="*_number_s" type="number" indexed="false"
stored="true" />
<!-- Indexed and stored general multivalued number fields -->
<dynamicField name="*_number_is" type="number" indexed="true"
stored="true" />
<!-- catchall text field that indexes tokens both normally and in
reverse
for efficient leading wildcard queries. -->
<field name="DocumentDisplayName_rev" type="text_rev" indexed="true"
stored="false" multiValued="false" />
<field name="DocumentDescription_rev" type="text_rev" indexed="true"
stored="false" multiValued="false" />
<field name="DocumentTAGS_rev" type="tags_rev" indexed="true"
stored="false" multiValued="true" />
<field name="DocumentContent_rev" type="text_rev" indexed="true"
stored="false" multiValued="true" />
<!-- All other fields -->
<dynamicField name="*" type="string" indexed="true"
stored="false" />
</fields>
<!-- Field to use to determine and enforce document uniqueness. Unless
this
field is marked with required="false", it will be a required field
-->
<uniqueKey>UUID</uniqueKey>
<!-- field for the QueryParser to use when an explicit fieldname is
absent -->
<defaultSearchField>DocumentDisplayName</defaultSearchField>
<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="AND" />
<!-- copyField commands copy one field to another at the time a document
is added to the index. It's used either to index the same field
differently,
or to add multiple fields to the same field for easier/faster
searching. -->
<copyField source="DocumentDescription" dest="DocumentDescription_rev"
/>
<copyField source="DocumentDisplayName" dest="DocumentDisplayName_rev"
/>
<copyField source="DocumentTAGS" dest="DocumentTAGS_rev" />
<copyField source="DocumentContent" dest="DocumentContent_rev" />
</schema>
What am I doing wrong?
On Wed, Aug 17, 2011 at 5:37 PM, Michael Ryan <mr...@moreover.com> wrote:
> Are you using the same analyzer for both type="query" and type="index"? Can
> you show us the fieldType from your schema?
>
> -Michael
>
RE: Solr Accent Insensitive and sensitive search
Posted by Michael Ryan <mr...@moreover.com>.
Are you using the same analyzer for both type="query" and type="index"? Can you show us the fieldType from your schema?
-Michael
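To illustrate why the index-time and query-time analyzers need to agree, here is a rough Python sketch. It is not Lucene's ASCIIFoldingFilter: fold_accents is a simplified stand-in that only strips combining marks, analyze is a hypothetical chain (whitespace split, optional folding, lowercase), and the sample text is made up for illustration.

```python
import unicodedata


def fold_accents(token):
    """Rough stand-in for ASCII folding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text, fold):
    """Hypothetical analysis chain: whitespace split, optional folding, lowercase."""
    tokens = text.split()
    if fold:
        tokens = [fold_accents(t) for t in tokens]
    return [t.lower() for t in tokens]


# The index side always folds in this sketch, as in the posted "text" fieldType.
indexed = set(analyze("Férias de verão", fold=True))

# Query analyzer that also folds: both spellings reduce to an indexed token.
both_fold = [set(analyze(q, fold=True)) <= indexed for q in ("ferias", "férias")]

# Query analyzer that skips folding: the accented spelling finds nothing.
query_unfolded = [set(analyze(q, fold=False)) <= indexed for q in ("ferias", "férias")]

print(both_fold, query_unfolded)
```

When both chains fold, "ferias" and "férias" become the same token, so a miss on the accented form suggests the query is not passing through the analyzer one expects, for example because it targets a different field.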