Posted to solr-user@lucene.apache.org by adityab <ad...@yahoo.com> on 2013/11/19 15:55:07 UTC

External File field Reload option

Hi,
I have been using an external file field (eff) to hold the rank of each
document, which gets updated every day based on different stats collected by
the system. Once the rank is computed, the new files are pushed to the master,
which eventually replicates them to the slaves on the next commit.

Our eff file has around 1.6M lines, each a simple key-value pair. It is
roughly 16MB. We have observed that loading this file for the first time takes
around 192 sec. I agree this can be done at server startup and should not
impact performance while serving traffic. (We have 10 such fields, one file
per zone.)

Documents are pushed to the master every 2 hrs in batches, while the eff file
is pushed just once a day. Because we commit every 2 hrs, when a new reader
is opened on the slaves after replication, it takes a long time to warm up
because it has to load the eff file again.

If the file has not changed and resides outside the index, is
there a way in Solr to check whether the eff file was actually modified before
trying to reload it?
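
To make the idea concrete, here is a minimal standalone sketch of a
"reload only if modified" check. It is not Solr code or a Solr API: the
EffCache class, the parse_eff helper, and the key=value line layout are
assumptions made for illustration. It caches the parsed key-to-value map and
reparses only when the file's modification time or size changes. It does not
cover any per-reader state that Solr itself builds from the file.

```python
import os


def parse_eff(path):
    """Parse lines of the assumed form key=value into a dict of floats."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if "=" not in line:
                continue  # skip blank or malformed lines
            key, _, raw = line.rpartition("=")
            values[key] = float(raw)
    return values


class EffCache:
    """Hypothetical cache: reparses the file only when it has changed."""

    def __init__(self, path):
        self.path = path
        self._stamp = None   # (mtime in ns, size) seen at the last load
        self._values = {}
        self.loads = 0       # number of times the file was actually parsed

    def get(self):
        st = os.stat(self.path)
        stamp = (st.st_mtime_ns, st.st_size)
        if stamp != self._stamp:
            self._values = parse_eff(self.path)
            self._stamp = stamp
            self.loads += 1
        return self._values
```

Calling get() repeatedly on an unchanged file parses it once; pushing a new
file changes the stamp and triggers exactly one reparse.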

Any other suggestions?

 



--
View this message in context: http://lucene.472066.n3.nabble.com/External-File-field-Reload-option-tp4101929.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: External File field Reload option

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Aditya,

When you commit, docnums are expected to change, hence the file has to be
reloaded.

There might be a few alternative approaches to address this problem, but they
are really bloody hacks, you know.

Hold on, if docs are pushed every few hours but the file changes daily, can't
you mix the boosting values from the file into the documents? It might force
you to index more and/or slightly delay the boosting update, but it's the only
easy win here.
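
A rough sketch of that idea, under stated assumptions: the rank file uses
key=value lines keyed by document id, and documents are plain dicts built
before they are sent to the indexer. The names load_ranks, attach_rank, "id",
and "rank" are hypothetical and chosen only for illustration.

```python
def load_ranks(lines):
    """Build an id -> rank map from lines of the assumed form key=value."""
    ranks = {}
    for line in lines:
        line = line.strip()
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, _, raw = line.rpartition("=")
        ranks[key] = float(raw)
    return ranks


def attach_rank(docs, ranks, key_field="id", rank_field="rank", default=0.0):
    """Return copies of the docs with the rank added as a regular field."""
    return [
        {**doc, rank_field: ranks.get(doc[key_field], default)}
        for doc in docs
    ]


# Usage: merge ranks into a batch just before indexing it.
ranks = load_ranks(["doc1=0.75", "doc2=1.5"])
batch = attach_rank(
    [{"id": "doc1", "title": "a"}, {"id": "doc9", "title": "b"}], ranks
)
```

The trade-off is the one noted above: a rank change only becomes visible once
the affected document is reindexed.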


-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
 <mk...@griddynamics.com>