Posted to issues@lucene.apache.org by "kkewwei (Jira)" <ji...@apache.org> on 2019/12/17 03:31:00 UTC

[jira] [Updated] (LUCENE-9096) Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler

     [ https://issues.apache.org/jira/browse/LUCENE-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kkewwei updated LUCENE-9096:
----------------------------
    Description: 
In CompressingTermVectorsWriter.flushOffsets, sumPos and sumOffsets are computed as follows:
{code:java}
for (int i = 0; i < fd.numTerms; ++i) { 
  int previousPos = 0;
  int previousOff = 0;
  for (int j = 0; j < fd.freqs[i]; ++j) { 
    final int position = positionsBuf[fd.posStart + pos];
    final int startOffset = startOffsetsBuf[fd.offStart + pos];
    sumPos[fieldNumOff] += position - previousPos; 
    sumOffsets[fieldNumOff] += startOffset - previousOff; 
    previousPos = position;
    previousOff = startOffset;
    ++pos;
  }
}
{code}
Each iteration adds position - previousPos, so the sum over one term's positions telescopes: (position5-position4)+(position4-position3)+(position3-position2)+(position2-position1).

This can be simplified to position5-position1.
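
The telescoping argument can be checked with a small standalone sketch. It is not the Lucene implementation: the class name, method names, and flat arrays are hypothetical stand-ins for fd.freqs and positionsBuf, and a single long accumulator replaces sumPos[fieldNumOff]. Note that because previousPos starts at 0 for every term in the loop quoted above, the first delta is position1 - 0, so the per-term sum reduces to the last position of that term. The same reasoning would apply to start offsets.

```java
public class TelescopingSumSketch {

  // Mirrors the loop from the issue: add position - previousPos for every
  // occurrence, resetting previousPos to 0 at the start of each term.
  public static long sumOfDeltas(int[] freqs, int[] positions) {
    long sum = 0;
    int pos = 0;
    for (int i = 0; i < freqs.length; ++i) {
      int previousPos = 0;
      for (int j = 0; j < freqs[i]; ++j) {
        final int position = positions[pos];
        sum += position - previousPos;
        previousPos = position;
        ++pos;
      }
    }
    return sum;
  }

  // Telescoped form: the intermediate positions cancel out, so only the
  // last position of each term contributes (previousPos starts at 0).
  public static long sumOfLastPositions(int[] freqs, int[] positions) {
    long sum = 0;
    int pos = 0;
    for (int i = 0; i < freqs.length; ++i) {
      if (freqs[i] > 0) {
        sum += positions[pos + freqs[i] - 1];
      }
      pos += freqs[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    // Hypothetical data: two terms with 3 and 2 occurrences.
    int[] freqs = {3, 2};
    int[] positions = {1, 4, 9, 2, 7};
    System.out.println(sumOfDeltas(freqs, positions));
    System.out.println(sumOfLastPositions(freqs, positions));
  }
}
```

Both methods return the same value for any input where freqs sums to the number of entries in positions, which is what would allow the inner loop to be dropped for the purpose of computing the sums.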

 

  was:
In CompressingTermVectorsWriter.flushOffsets,  we count 

sumPos and sumOffsets by the way
{code:java}
for (int i = 0; i < fd.numTerms; ++i) { 
  int previousPos = 0;
  int previousOff = 0;
  for (int j = 0; j < fd.freqs[i]; ++j) { 
    final int position = positionsBuf[fd.posStart + pos];
    final int startOffset = startOffsetsBuf[fd.offStart + pos];
    sumPos[fieldNumOff] += position - previousPos; 
    sumOffsets[fieldNumOff] += startOffset - previousOff; 
    previousPos = position;
    previousOff = startOffset;
    ++pos;
  }
}
{code}
we always use the position - previousPos,  it can be summarized like this:                 (position5-position4)+(position4-position3)+(position3-position2)+(position2-position1).

If we should simplify it: position5-position1

 


> Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-9096
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9096
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: 8.2
>            Reporter: kkewwei
>            Priority: Major
>
> In CompressingTermVectorsWriter.flushOffsets, sumPos and sumOffsets are computed as follows:
> {code:java}
> for (int i = 0; i < fd.numTerms; ++i) { 
>   int previousPos = 0;
>   int previousOff = 0;
>   for (int j = 0; j < fd.freqs[i]; ++j) { 
>     final int position = positionsBuf[fd.posStart + pos];
>     final int startOffset = startOffsetsBuf[fd.offStart + pos];
>     sumPos[fieldNumOff] += position - previousPos; 
>     sumOffsets[fieldNumOff] += startOffset - previousOff; 
>     previousPos = position;
>     previousOff = startOffset;
>     ++pos;
>   }
> }
> {code}
> Each iteration adds position - previousPos, so the sum over one term's positions telescopes: (position5-position4)+(position4-position3)+(position3-position2)+(position2-position1).
> This can be simplified to position5-position1.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org