Posted to solr-user@lucene.apache.org by Suryasnat Das <su...@gmail.com> on 2009/03/05 19:07:06 UTC

SOLR query.

Hi,

I have some questions about Solr for which I need a quick resolution. Prompt
help would be greatly appreciated.

a.) We know that fields are also indexed. So can we index some specific
fields (like author, id, etc.) first and then index the rest of the
fields (like creation date) at a later time?

b.) SOLR returns the whole text content of a file during a search operation.
So how can we extract a portion of the whole content? I mean a snippet of
the content containing that search keyword. Sample code would be of great
help.

c.) What is multi core indexing?

d.) How many index files are normally created in an index operation? What
is the expected number of index files when I index 4 terabytes of file
data, and what will the total size of all the index files be? If anybody
has worked on such a huge volume of data, some pointers would be of great
help.

Regards
Suryasnat Das

Re: SOLR query.

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.

On Thu, Mar 5, 2009 at 11:45 PM, Erik Hatcher <er...@ehatchersolutions.com> wrote:

>
> On Mar 5, 2009, at 1:07 PM, Suryasnat Das wrote:
>
>> I have some questions about Solr for which I need a quick resolution.
>> Prompt help would be greatly appreciated.
>>
>> a.) We know that fields are also indexed. So can we index some specific
>> fields (like author, id, etc.) first and then index the rest of the
>> fields (like creation date) at a later time?
>>
>
> You have to reindex the entire document in order to add fields to it, but
> you certainly can do so at any time.  In other words, you can just add
> fields to an existing document without sending in all the fields you want
> on that document.
>

I think Erik meant that you cannot add fields to an existing document
without sending in all the fields again.
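
To illustrate, here is a minimal Python sketch of resending the whole
document in Solr's XML update format. The field names and values
(id, author, creation_date) are hypothetical, and it assumes "id" is the
uniqueKey in your schema, so that the second add replaces the first.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc):
    """Build a Solr XML <add> message containing every field of one document."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# First pass: only some fields are known (hypothetical values).
doc = {"id": "doc-1", "author": "Suryasnat Das"}
first_message = build_add_xml(doc)

# Later: the new field is added, but the complete document is sent again,
# including the fields that were already indexed.
doc["creation_date"] = "2009-03-05T19:07:06Z"
second_message = build_add_xml(doc)

print(second_message)
```

Each message would be POSTed to the update handler and followed by a
commit; the second one overwrites the earlier document with the same id.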
-- 
Regards,
Shalin Shekhar Mangar.

Re: SOLR query.

Posted by Erik Hatcher <er...@ehatchersolutions.com>.

On Mar 5, 2009, at 1:07 PM, Suryasnat Das wrote:
> I have some questions about Solr for which I need a quick resolution.
> Prompt help would be greatly appreciated.
>
> a.) We know that fields are also indexed. So can we index some specific
> fields (like author, id, etc.) first and then index the rest of the
> fields (like creation date) at a later time?

You have to reindex the entire document in order to add fields to it,  
but you certainly can do so at any time.  In other words, you can just  
add fields to an existing document without sending in all the fields  
you want on that document.

> b.) SOLR returns the whole text content of a file during a search
> operation. So how can we extract a portion of the whole content? I mean a
> snippet of the content containing that search keyword. Sample code would
> be of great help.

Use Solr's highlighting capabilities:
<http://wiki.apache.org/solr/HighlightingParameters>
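
As a rough sketch, a request that asks for snippets could be built as
follows in Python. The host, port, and the field name "content" are
assumptions for illustration; hl, hl.fl, hl.snippets, and hl.fragsize are
the standard highlighting parameters, and the highlighted field has to be
stored for snippets to come back.

```python
from urllib.parse import urlencode


def highlight_query_url(base_url, keyword, field):
    """Build a select URL that returns highlighted snippets for `field`."""
    params = {
        "q": f"{field}:{keyword}",
        "hl": "true",        # turn highlighting on
        "hl.fl": field,      # field(s) to generate snippets from
        "hl.snippets": 1,    # maximum snippets per field
        "hl.fragsize": 100,  # approximate snippet length in characters
        "fl": "id",          # return only the id, not the whole stored text
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical Solr location and field name.
url = highlight_query_url("http://localhost:8983/solr", "lucene", "content")
print(url)
```

The snippets come back in a separate "highlighting" section of the
response, keyed by each document's unique key.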

> c.) What is multi core indexing?

Separate Solr/Lucene indexes, that all are served from a single  
instance of Solr.
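
For example, with a multi-core setup each core is typically addressed by
its name in the URL path. A small Python sketch, where the host, port, and
the core names "books" and "articles" are made up for illustration:

```python
def core_url(base_url, core, handler):
    """Return the URL of a request handler on one named core."""
    return f"{base_url.rstrip('/')}/{core}/{handler}"


# Hypothetical Solr location and core names.
base = "http://localhost:8983/solr"
books_select = core_url(base, "books", "select")
articles_update = core_url(base, "articles", "update")

print(books_select)
print(articles_update)
```

Each core has its own index and configuration, so documents sent to one
core are not visible to searches against another.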

> d.) How many index files are normally created in an index operation?

Depends on the number of fields, and how you have the index  
configuration set.  If file handles ever become a problem you can set  
it to use the compound file format, but in practice I've never seen it  
be a problem.

> What is the expected number of index files when I index 4 terabytes of
> file data, and what will the total size of all the index files be? If
> anybody has worked on such a huge volume of data, some pointers would be
> of great help.

The rule of thumb is that a Lucene index is roughly 35% of the size of
the original text, assuming you are not storing the fields in Lucene,
but only indexing them.
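
Applied to the 4 terabytes in the question, that rule of thumb works out
as below. This is only back-of-the-envelope arithmetic in Python; it
assumes the 4 TB is all extractable text, which is rarely true for binary
file formats, so treat the result as a rough guide.

```python
def estimate_index_size_tb(source_text_tb, ratio=0.35):
    """Estimate index size using the ~35% rule of thumb (unstored fields)."""
    return source_text_tb * ratio


estimate = estimate_index_size_tb(4)
print(f"Estimated index size: about {estimate:.1f} TB")
```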

	Erik