Posted to solr-user@lucene.apache.org by "Burton-West, Tom" <tb...@umich.edu> on 2009/03/25 18:50:17 UTC

Can TermIndexInterval be set in Solr?

Hello all,

We are experimenting with the ShingleFilter on a very large document set (1 million full-text books). Because the ShingleFilter indexes every word pair as a token, the number of unique terms increases tremendously. In our experiments so far, the tii and tis files are getting very large, and the tii file will eventually be too large to fit into memory. If we set the TermIndexInterval to a number larger than the default of 128, the tii file should shrink. Is it possible to set this through Solr configuration, or do we need to modify the code somewhere and call IndexWriter.setTermIndexInterval?
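
A back-of-the-envelope sketch of why a larger interval should shrink the tii file, assuming the term index keeps roughly one entry for every TermIndexInterval-th term in the tis term dictionary. The function name tii_entries and the figure of 2 billion unique terms are hypothetical, chosen only for illustration; they are not measurements from our index.

```python
def tii_entries(unique_terms: int, interval: int = 128) -> int:
    """Approximate number of entries in the term index (.tii).

    Assumes one index entry per `interval` terms of the term
    dictionary (.tis), rounded up. 128 is the default interval
    mentioned above.
    """
    if unique_terms < 0:
        raise ValueError("unique_terms must be non-negative")
    if interval < 1:
        raise ValueError("interval must be at least 1")
    # Integer ceiling division, avoids floating-point rounding.
    return (unique_terms + interval - 1) // interval


if __name__ == "__main__":
    hypothetical_terms = 2_000_000_000  # illustrative only
    for interval in (128, 256, 512, 1024):
        entries = tii_entries(hypothetical_terms, interval)
        print(f"interval={interval:5d}  approx. tii entries={entries:,}")
```

Under that assumption, going from 128 to 1024 cuts the number of in-memory index entries by a factor of eight. The likely trade-off is that each term lookup has to scan more of the tis file.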


Tom

Tom Burton-West
Digital Library Production Services
University of Michigan Library

 

Re: Can TermIndexInterval be set in Solr?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
I think it's the latter.  I don't think the term index interval is exposed anywhere in Solr's configuration.  If you expose it through the config and provide a patch, I think we can add it to the core quickly.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


