Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/02/07 17:59:21 UTC

[GitHub] [lucene] jpountz commented on pull request #649: LUCENE-10408 Better encoding of doc Ids in vectors

jpountz commented on pull request #649:
URL: https://github.com/apache/lucene/pull/649#issuecomment-1031754530


   Optimizing for the case when all docs have a value makes sense to me.
   
   > for a case when only certain documents have vectors, we do delta encoding of doc Ids.
   
   In the past we rejected changes that wrote data in a compressed form on disk but still held it uncompressed in memory.
   
   I wonder if it would be a better trade-off to keep ints uncompressed, but read them directly from disk instead of loading giant arrays into memory? Or possibly switch to something like DirectMonotonicReader if it doesn't slow down searches.
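
   To make the first option concrete, here is a minimal sketch of "uncompressed ints, read from disk on demand". It uses only `java.nio` and is not Lucene code: the class `OnDiskDocIds` and its methods are hypothetical names, and a real change would presumably go through Lucene's own input abstractions rather than a raw mapped buffer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Objects;

/** Hypothetical helper (not a Lucene class): ordinal -> doc ID lookups served from a mapped file. */
final class OnDiskDocIds {
  // Memory-mapped view of the file; pages live in the OS page cache, not in a heap int[].
  private final ByteBuffer buffer;
  private final int size;

  private OnDiskDocIds(ByteBuffer buffer, int size) {
    this.buffer = buffer;
    this.size = size;
  }

  /** Writes the doc IDs as fixed-width 4-byte little-endian ints, with no compression. */
  static void write(Path path, int[] docIds) throws IOException {
    ByteBuffer out =
        ByteBuffer.allocate(docIds.length * Integer.BYTES).order(ByteOrder.LITTLE_ENDIAN);
    for (int id : docIds) {
      out.putInt(id);
    }
    Files.write(path, out.array());
  }

  /** Maps the file read-only; nothing is copied onto the Java heap here. */
  static OnDiskDocIds open(Path path) throws IOException {
    try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
      // The mapping stays valid after the channel is closed.
      ByteBuffer buf =
          ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size()).order(ByteOrder.LITTLE_ENDIAN);
      return new OnDiskDocIds(buf, (int) (ch.size() / Integer.BYTES));
    }
  }

  int size() {
    return size;
  }

  /** Random-access lookup: one absolute 4-byte read at offset ord * 4. */
  int docId(int ord) {
    Objects.checkIndex(ord, size);
    return buffer.getInt(ord * Integer.BYTES);
  }
}
```

   Each lookup is a single fixed-offset read, so there is no decode step on the search path. The cost is 4 bytes per doc on disk, which is the trade-off against a monotonic encoding.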


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


