Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/11/22 20:12:30 UTC

[GitHub] [lucene] hendrikmuhs commented on a change in pull request #460: LUCENE-10247 - reduce size of FSTs by relative coding

hendrikmuhs commented on a change in pull request #460:
URL: https://github.com/apache/lucene/pull/460#discussion_r754601969



##########
File path: lucene/core/src/java/org/apache/lucene/util/fst/FST.java
##########
@@ -1000,6 +1027,98 @@ private void writePresenceBits(
     assert bytePos - dest == numPresenceBytes;
   }
 
+  private long estimateNodeAddress(

Review comment:
       This is the trickiest part of the whole idea. I need to know [`thisNodeAddress`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/fst/FST.java#L779) before writing the state. That is hard because everything uses variable-length encodings, depends on flags, might write extra fields, etc. I considered using the position of the arc instead, but the compiler reverses the bytes it wrote.
   
   So what I ended up with is pre-calculating the node address. "Estimate" is therefore the wrong word: the result must be exact (or `0`, see below), so something like `preCalculateNodeAddress` would fit better. Feel free to suggest a better name.
   
   Good to know: the heuristic is allowed to fail, in which case it returns `0` and relative coding isn't used.
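
   To illustrate why the address has to be pre-calculated rather than read off the output position, here is a minimal sketch. It is not the code in this PR. It assumes a simplified, hypothetical arc layout (one flag byte, one label byte, and a variable-length delta to the target), assumes the node address is the last byte of the node after reversal, and uses made-up names (`NodeAddressSketch`, `vLongSize`). The size of each delta depends on the node address, which in turn depends on the total size, so the sketch retries a few times and returns `0` if the guess does not settle.

```java
class NodeAddressSketch {

  // Bytes needed for a non-negative value in a 7-bits-per-byte variable-length encoding.
  static int vLongSize(long v) {
    int n = 1;
    while ((v & ~0x7FL) != 0) {
      v >>>= 7;
      n++;
    }
    return n;
  }

  // Hypothetical layout (an assumption for this sketch, not the real FST format):
  // each arc = 1 flag byte + 1 label byte + vLong(nodeAddress - target).
  // Returns the exact node address, or 0 if it cannot be determined.
  static long preCalculateNodeAddress(long startAddress, long[] targets) {
    long guess = startAddress;
    for (int attempt = 0; attempt < 4; attempt++) {
      long size = 0;
      for (long target : targets) {
        long delta = guess - target;
        if (delta < 0) {
          return 0; // target lies after this node: give up, no relative coding
        }
        size += 2 + vLongSize(delta);
      }
      // Assumption: the node address is the last byte written for the node.
      long address = startAddress + size - 1;
      if (address == guess) {
        return address; // sizes computed from the guess reproduce the guess: exact
      }
      guess = address;
    }
    return 0; // did not converge: caller falls back to absolute coding
  }
}
```

   With `startAddress = 100` and targets `{10, 50}`, the first pass guesses 100 and computes 105; the second pass guesses 105 and computes 105 again, so 105 is returned. A caller would treat a return value of `0` as "write absolute targets instead".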




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


