Posted to dev@lucene.apache.org by "Uwe Schindler (Issue Comment Edited) (JIRA)" <ji...@apache.org> on 2011/12/11 17:36:47 UTC

[jira] [Issue Comment Edited] (LUCENE-3638) IndexReader.document always return a doc with all the stored fields loaded. And this can be slow for the indexed document contain huge fields

    [ https://issues.apache.org/jira/browse/LUCENE-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167131#comment-13167131 ] 

Uwe Schindler edited comment on LUCENE-3638 at 12/11/11 4:36 PM:
-----------------------------------------------------------------

bq. 3. I do not think I can store only the interesting part, because I do not know at index time which part is interesting. For example, the digest part of the search results is generated according to somebody's query.

"Digest" is the wrong word; it confused a lot of people here. The use case you are describing is "highlighting". I agree that this is expensive for very large fields.

In fact, your patch does not handle this case, and I agree with the others that it is too heavy to implement and brings back the crazy complexity we had with lazy fields & co.
                
      was (Author: thetaphi):
    bq. 3. I do not think I can store only the interesting part, because I do not know at index time which part is interesting. For example, the digest part of the search results is generated according to somebody's query.

"Digest" is the wrong word; it confused a lot of people here. The use case you are describing is "highlighting". I agree that this is expensive for very large fields.

In fact, your patch does not handle this case, and I agree that it is too heavy to implement and brings back the crazy complexity we had with lazy fields & co.
                  
> IndexReader.document always return a doc with all the stored fields loaded. And this can be slow for the indexed document contain huge fields
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3638
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3638
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index, core/search
>    Affects Versions: 4.0
>         Environment: 64bit linux java 1.6
>            Reporter: peter chang
>            Priority: Minor
>              Labels: patch
>             Fix For: 4.0
>
>         Attachments: doc.fields.patch
>
>
> When generating a digest for documents with huge fields, it should be unnecessary to load the whole field; only the interesting part of the field, located via the offset information, is needed. But IndexReader always returns the whole field content. As a result, a customized StoredFieldsReader ends up loading the field again.
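
For illustration only, here is a minimal sketch of the general idea behind avoiding a full load of every stored field: the reader asks a visitor, field by field, whether a value is needed, and skips or stops accordingly. Lucene 4.x exposes a comparable idea through StoredFieldVisitor, but every class and method name below (ToyStoredFields, FieldVisitor, load) is a hypothetical stand-in written for this note, not Lucene's API and not the attached patch. The sketch only covers skipping whole fields; it does not address loading just a slice of one huge field, which is the highlighting case discussed in the comment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SelectiveFieldLoading {

    /** Decision a visitor returns for each stored field (hypothetical, for illustration). */
    enum Status { YES, NO, STOP }

    /** Hypothetical visitor: consulted before each field value is materialized. */
    interface FieldVisitor {
        Status needsField(String name);
        void stringField(String name, String value);
    }

    /** Hypothetical stand-in for a stored-fields reader; records which values it hands out. */
    static final class ToyStoredFields {
        private final Map<String, String> fields = new LinkedHashMap<>();
        final List<String> materialized = new ArrayList<>();

        ToyStoredFields add(String name, String value) {
            fields.put(name, value);
            return this;
        }

        void visit(FieldVisitor visitor) {
            for (Map.Entry<String, String> e : fields.entrySet()) {
                Status s = visitor.needsField(e.getKey());
                if (s == Status.STOP) {
                    return;                 // nothing after this field is touched
                }
                if (s == Status.NO) {
                    continue;               // a real reader would skip the bytes on disk here
                }
                materialized.add(e.getKey());
                visitor.stringField(e.getKey(), e.getValue());
            }
        }
    }

    /** Loads only the wanted fields and stops once all of them have been collected. */
    static Map<String, String> load(ToyStoredFields doc, final Set<String> wanted) {
        final Map<String, String> out = new LinkedHashMap<>();
        doc.visit(new FieldVisitor() {
            @Override
            public Status needsField(String name) {
                if (out.size() == wanted.size()) {
                    return Status.STOP;
                }
                return wanted.contains(name) ? Status.YES : Status.NO;
            }

            @Override
            public void stringField(String name, String value) {
                out.put(name, value);
            }
        });
        return out;
    }

    public static void main(String[] args) {
        ToyStoredFields doc = new ToyStoredFields()
                .add("title", "Example title")
                .add("body", "imagine a very large value here")
                .add("tail", "x");
        Map<String, String> loaded = load(doc, Collections.singleton("title"));
        System.out.println(loaded);            // fields the caller received
        System.out.println(doc.materialized);  // fields the toy reader actually handed out
    }
}
```

In this toy version the "body" value is never passed to the visitor, which is the saving the issue asks for at the granularity of whole fields. Partial loading of a single field by offset would need support in the stored-fields format itself, which is the complexity the comment above objects to.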
