Posted to general@lucene.apache.org by solrfan <a2...@jnxjn.com> on 2011/05/06 17:33:41 UTC

Whole unfiltered content in response document field

Hi, I have a question about the content of the document fields. My configuration
is OK so far: I index a database with DIH and have configured an index
analyzer as follows:

<analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords.txt"
                enablePositionIncrements="true"
                />
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1" catenateAll="0"
                splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>

...

 <fields>
   <field name="id" type="int" indexed="true" stored="true" required="true"/>
   <field name="text" type="text" indexed="true" stored="true"/>
 </fields>

In the analysis view, my filters work properly: at the end of the filter
chain I have only the tokens I am interested in. But when I search with Solr,
the response contains the whole content of the indexed database field. The
field contains stopwords, whitespace, uppercase letters and so on. I search for
stopwords, and I can find them. I would expect the field in the response
document to contain only the filtered content, not the original raw
content that I wanted to index.
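
To make the difference between the two views concrete, here is a toy sketch in
Python. It is not Solr code: the names ToyIndex, analyze and STOPWORDS are
hypothetical, the stopword list is a stand-in for stopwords.txt, the word
delimiter filter is left out, and lowercasing the query term is an assumption.
It only models the idea that the analysis chain produces the searchable tokens,
while a field with stored="true" is returned exactly as it was submitted.

```python
# Toy model only: hypothetical names, not the Solr or Lucene API.
STOPWORDS = {"the", "a", "and", "of"}  # stand-in for stopwords.txt


def analyze(text):
    """Rough analogue of the index analyzer (word delimiter filter omitted)."""
    tokens = text.split()  # split on whitespace
    tokens = [t for t in tokens if t.lower() not in STOPWORDS]  # drop stopwords, ignoring case
    return [t.lower() for t in tokens]  # lowercase what is left


class ToyIndex:
    def __init__(self):
        self.stored = {}    # doc id -> original field value, kept verbatim
        self.postings = {}  # token -> set of doc ids, built from analyzed tokens

    def add(self, doc_id, text):
        self.stored[doc_id] = text  # what stored="true" keeps
        for token in analyze(text):  # what indexed="true" makes searchable
            self.postings.setdefault(token, set()).add(doc_id)

    def search(self, term):
        # Assumption: the query term is lowercased before lookup.
        ids = sorted(self.postings.get(term.lower(), set()))
        # The response is built from the stored value, not from the tokens.
        return [{"id": i, "text": self.stored[i]} for i in ids]


index = ToyIndex()
index.add(1, "The   Quick Brown Fox")
print(analyze("The   Quick Brown Fox"))  # tokens after the filter chain
print(index.search("quick"))             # document with its raw stored text
```

In this model the analysis view corresponds to the output of analyze(), and the
search response corresponds to the stored dictionary, which is why the raw text
with its extra whitespace and uppercase letters comes back.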

Is this normal behaviour? Do I understand Solr correctly?

Many thanks!

