Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2016/10/17 08:03:58 UTC

[jira] [Moved] (SOLR-9651) Consider tracking modification time of external file fields for faster reloading

     [ https://issues.apache.org/jira/browse/SOLR-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand moved LUCENE-7488 to SOLR-9651:
--------------------------------------------

    Affects Version/s:     (was: 4.10.4)
                       4.10.4
             Security: Public
          Component/s:     (was: core/index)
        Lucene Fields:   (was: New)
                  Key: SOLR-9651  (was: LUCENE-7488)
              Project: Solr  (was: Lucene - Core)

> Consider tracking modification time of external file fields for faster reloading
> --------------------------------------------------------------------------------
>
>                 Key: SOLR-9651
>                 URL: https://issues.apache.org/jira/browse/SOLR-9651
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>    Affects Versions: 4.10.4
>         Environment: Linux
>            Reporter: Mike
>
> I have an index of about 4M legal documents with PageRank boosting configured as an external file field. The external file is about 100MB in size and has one row per document in the index; each row gives that document's PageRank score. Whenever a new searcher is opened, this file has to be reloaded, which takes several seconds and creates a noticeable delay for our users.
> An idea to fix this came up in [a recent discussion in the Solr mailing list|https://www.mail-archive.com/solr-user@lucene.apache.org/msg125521.html]: could the file be reloaded only if it has changed on disk? In other words, when a new searcher is opened, could it check the file's modification time and skip the reload if the file hasn't changed?
> In our configuration, this would be a big improvement. We only change the PageRank file once a week, because computing it is intensive and new documents don't tend to have a big impact. At the same time, because we're regularly adding new documents, we do hundreds of commits per day, and each one incurs this delay while the (largish) external file field is reloaded.
> Is this a reasonable improvement to request? 
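
The idea in the description above (skip the reload when the file's modification time is unchanged) could look roughly like the following. This is only a sketch under stated assumptions, not Solr's actual implementation: the class name ModTimeCachedScores and its methods are hypothetical, and it assumes the external file holds one key=value pair per line.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr: caches the parsed scores and
// re-parses the file only when its modification time has changed.
final class ModTimeCachedScores {
    private final Path file;
    private FileTime loadedModTime;     // modification time of the cached copy
    private Map<String, Float> scores;  // cached parse result
    private int loadCount;              // number of real parses, for illustration only

    ModTimeCachedScores(Path file) {
        this.file = file;
    }

    // Returns the cached scores, reloading only if the file changed on disk.
    synchronized Map<String, Float> get() throws IOException {
        FileTime current = Files.getLastModifiedTime(file);
        if (scores == null || !current.equals(loadedModTime)) {
            scores = parse(file);
            loadedModTime = current;
            loadCount++;
        }
        return scores;
    }

    synchronized int loadCount() {
        return loadCount;
    }

    // Assumed file format: one "key=value" pair per line.
    private static Map<String, Float> parse(Path file) throws IOException {
        Map<String, Float> result = new HashMap<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue; // skip blank or malformed lines
            }
            String key = line.substring(0, eq).trim();
            float value = Float.parseFloat(line.substring(eq + 1).trim());
            result.put(key, value);
        }
        return result;
    }
}
```

With something like this, each new searcher would call get() and pay the parsing cost only after the weekly file update. One caveat: a file rewritten within the filesystem's timestamp granularity would not be detected, so a real implementation might also compare the file size or a checksum.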



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
