Posted to solr-user@lucene.apache.org by dhastings <dh...@wshein.com> on 2011/08/08 22:41:42 UTC

solr 3.1, not indexing entire document?

hi, i have my solr field text configured as per earlier discussion:

 <fieldType name="text" class="solr.TextField" positionIncrementGap="100"
autoGeneratePhraseQueries="true">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        
        
        
        <filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        
        <filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="0"
catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>


and for debugging purposes i am storing the text field as well, so:


   <field name="text" type="text" indexed="true" stored="true" />

now when i do a search against a document, that i KNOW has a certain phrase,
in this case "official handbook of the Federal Government"

my query looks like:

<result name="response" numFound="0" start="0" maxScore="0.0"/>
<lst name="debug">
  <str name="rawquerystring">id:062085.1 AND text:"official handbook of the Federal Government"</str>
  <str name="querystring">id:062085.1 AND text:"official handbook of the Federal Government"</str>
  <str name="parsedquery">+id:062085.1 +PhraseQuery(text:"official handbook of the federal government")</str>
  <str name="parsedquery_toString">+id:062085.1 +text:"official handbook of the federal government"</str>


i get 0 results. but when i search just for that id, i do get the document back:


way way at the end sure enough is the string

http://qihealing.net/doc.txt output 

is there a document size limit, or is it that i'm sending it to solr
using solrj and it's too large?






--
View this message in context: http://lucene.472066.n3.nabble.com/solr-3-1-not-indexing-entire-document-tp3236719p3236719.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: solr 3.1, not indexing entire document?

Posted by dhastings <dh...@wshein.com>.

that was it... thanks. obviously the document is well over 2 megs.


Re: solr 3.1, not indexing entire document?

Posted by Markus Jelsma <ma...@openindex.io>.

Check your maxFieldLength setting.
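
For reference, a rough sketch of where this setting usually lives in a Solr 3.x solrconfig.xml. The value shown is only illustrative (an assumption, not something taken from this thread). The stock example config ships with 10000, and the limit counts tokens per field rather than bytes, so anything past that many tokens in a field is dropped at index time while the stored value still shows the full text.

```xml
<!-- solrconfig.xml (Solr 3.x). The value below is an illustrative assumption. -->
<indexDefaults>
  <!-- Maximum number of tokens indexed per field; tokens beyond this are ignored.
       The example config that ships with Solr uses 10000. -->
  <maxFieldLength>2147483647</maxFieldLength>
</indexDefaults>
```

If the same element also appears under <mainIndex> in your config, check that one too. Documents indexed under the old limit stay truncated until they are reindexed.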
