Posted to dev@lucene.apache.org by Shawn Heisey <ap...@elyograg.org> on 2017/01/12 23:59:05 UTC

A question about DV operation

I have an idea for a Solr feature, but to know whether it's at all
viable, I need a question about Lucene operation answered.

In recent versions of Solr, if a field is neither stored nor indexed but
does have docValues, the original data sent for that field will be
returned in search results.  In older versions (not sure which ones), a
field had to be stored to be returned.

Let's say that such a field contains a very large amount of data in
every document.  Normally, this would affect OS disk cache efficiency
for general queries, because the docValues data for the field would need
to be read in order to be included in search results.  Reading that
large amount of data can pollute the disk cache.  If the system is in a
low-memory situation, that can affect performance.

What happens if every query has an explicit list of fields to return in
results, and the list of fields does NOT include this field that
contains a large amount of data in docValues?  Does this mean that the
docValues data for the field I've mentioned is never read, and has no
effect on OS disk cache efficiency?  Or would Lucene read the docValues
data even though it doesn't include it in results?
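
For concreteness, a request of the kind described above might be built
like this.  This is only a minimal sketch: the host, the collection name
"mycollection", and the field names are hypothetical, while q and fl are
the standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, fields):
    """Build a Solr /select URL that asks only for the listed fields."""
    params = {"q": query, "fl": ",".join(fields)}
    return f"{base_url}/select?{urlencode(params)}"


url = build_select_url(
    "http://localhost:8983/solr/mycollection",  # hypothetical host and collection
    "*:*",
    ["id", "title"],  # hypothetical small fields; the large docValues field is left out
)
print(url)
```

The large field never appears in fl, which is the situation the question
is about.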

Thanks,
Shawn


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: A question about DV operation

Posted by Erick Erickson <er...@gmail.com>.
Shawn:

Tangentially related is SOLR-3191 (field exclusion from fl) which has
been in my queue for a very long time and I'd be _delighted_ if
someone took it away and gave it some attention. Rather than having to
specify all the fields you want just to omit the one big one, you
should be able to specify * but then exclude the big one.
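
Until something like that exists, the exclusion can be approximated on
the client.  A rough sketch only, assuming the client already knows the
full list of field names; the names below are made up, and this is not
the syntax proposed in SOLR-3191.

```python
def fl_excluding(all_fields, excluded):
    """Return a comma-separated fl value with the excluded fields dropped, keeping order."""
    excluded = set(excluded)
    return ",".join(f for f in all_fields if f not in excluded)


schema_fields = ["id", "title", "author", "big_blob"]  # hypothetical field names
fl_value = fl_excluding(schema_fields, ["big_blob"])
print(fl_value)  # prints: id,title,author
```

The obvious drawback is that the client has to keep its field list in
sync with the schema, which is exactly what a server-side exclusion
would avoid.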

Honest, I've had every intention of getting to this for...a very long time ;(

Erick




Re: A question about DV operation

Posted by Ishan Chattopadhyaya <ic...@gmail.com>.
> What happens if every query has an explicit list of fields to return in
> results, and the list of fields does NOT include this field that
> contains a large amount of data in docValues? Does this mean that the
> docValues data for the field I've mentioned is never read, and has no
> effect on OS disk cache efficiency?

That is my understanding.

