Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/03/21 16:53:14 UTC

[GitHub] [lucene] rmuir opened a new pull request #28: LUCENE-9827: avoid wasteful recompression for small segments

rmuir opened a new pull request #28:
URL: https://github.com/apache/lucene/pull/28


   Require that a segment has enough dirty documents to create a clean
   chunk before recompressing it during merge: there must be at least
   maxChunkSize.
   
   This prevents wasteful recompression with small flushes (e.g. every
   document): we ensure recompression achieves some "permanent" progress.
   
   Expose maxDocsPerChunk as a parameter for term vectors too, matching the
   stored fields format. This allows for easy testing.
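
The merge-time check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the class `RecompressionSketch`, the method `worthRecompressing`, and the choice to measure the threshold in documents via `maxDocsPerChunk` are all assumptions made for the example.

```java
/** Hypothetical sketch of the "enough dirty documents" check; not Lucene's actual code. */
public final class RecompressionSketch {
    private RecompressionSketch() {}

    /**
     * Returns true only when the dirty documents of a segment could fill at least
     * one complete ("clean") chunk, so that recompressing them makes lasting progress.
     *
     * @param numDirtyDocs    number of documents sitting in incomplete (dirty) chunks
     * @param maxDocsPerChunk assumed chunk capacity, counted in documents
     */
    public static boolean worthRecompressing(long numDirtyDocs, int maxDocsPerChunk) {
        if (maxDocsPerChunk <= 0) {
            throw new IllegalArgumentException("maxDocsPerChunk must be positive");
        }
        // Too few dirty docs: recompressing now would only produce another dirty chunk.
        return numDirtyDocs >= maxDocsPerChunk;
    }

    public static void main(String[] args) {
        int maxDocsPerChunk = 128; // illustrative value only
        // A segment flushed after a single document is left alone.
        System.out.println(worthRecompressing(1, maxDocsPerChunk));
        // A segment with many dirty documents is recompressed.
        System.out.println(worthRecompressing(500, maxDocsPerChunk));
    }
}
```

With a small `maxDocsPerChunk`, a test can reach the threshold after only a handful of single-document flushes, which is presumably why exposing the parameter makes testing easier.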
   
   See JIRA for more details: https://issues.apache.org/jira/browse/LUCENE-9827?focusedCommentId=17305712&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17305712
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] rmuir commented on pull request #28: LUCENE-9827: avoid wasteful recompression for small segments

Posted by GitBox <gi...@apache.org>.
rmuir commented on pull request #28:
URL: https://github.com/apache/lucene/pull/28#issuecomment-805814747


   Thanks @jpountz for the commit! The code is simpler and holds up under the testing I have thrown at it. If anything, it seems faster: I indexed 1M docs (flushing every doc), and it went 20% faster with https://github.com/apache/lucene/pull/28/commits/4856b6f0fe605e4591910b227dd87b10b137cc83 than without.




[GitHub] [lucene] rmuir merged pull request #28: LUCENE-9827: avoid wasteful recompression for small segments

Posted by GitBox <gi...@apache.org>.
rmuir merged pull request #28:
URL: https://github.com/apache/lucene/pull/28


   

