Posted to java-user@lucene.apache.org by "jointcc2 ." <mk...@gmail.com> on 2020/04/07 18:44:51 UTC

Question Regarding Computing Vocabulary Size

Hello, I am a master's student currently working on a search engine project
on BM25Similarity. My question is about computing the vocabulary size of a
single document, that is, the number of distinct terms it contains. I have
looked through the code base but have not found anything that does this.
Is there a way to compute the size of the set of distinct terms for a
single document? Please let me know if you can help me with this. Many
thanks.
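
For concreteness, the quantity I am after is what the sketch below computes in plain Java, without Lucene. The class and method names are made up for illustration, and the tokenization (lowercasing, then splitting on anything that is not a letter or digit) is only an assumed stand-in for whatever Analyzer is used at index time.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class DistinctTermCount {

    // Hypothetical helper: returns the number of distinct terms in one
    // document's text. The tokenization here is an assumption, not what
    // any particular Lucene Analyzer does.
    public static int distinctTermCount(String text) {
        Set<String> terms = new HashSet<>();
        // Lowercase, then split on runs of non-letter, non-digit characters.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            // split() can yield an empty leading token; skip it.
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms.size();
    }

    public static void main(String[] args) {
        // 11 tokens in total, but "the" occurs three times.
        String doc = "the quick brown fox jumps over the lazy dog the end";
        System.out.println(distinctTermCount(doc)); // prints 9
    }
}
```

What I am looking for is a way to get this number for a document that is already in the index.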

Michael