Posted to dev@lucene.apache.org by "Amrit Sarkar (JIRA)" <ji...@apache.org> on 2017/08/03 16:09:00 UTC

[jira] [Commented] (SOLR-11188) Hi CPU utilization when highlighting mergecontiguous=true

    [ https://issues.apache.org/jira/browse/SOLR-11188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113003#comment-16113003 ] 

Amrit Sarkar commented on SOLR-11188:
-------------------------------------

I have only skimmed this so far, so nothing substantial yet. I was looking at the code on the *master* branch:

DefaultHighlighter :: 331-344
{code}
      //merge any contiguous fragments to improve readability
      if(mergeContiguousFragments)
      {
        mergeContiguousFragments(frag);
        ArrayList<TextFragment> fragTexts = new ArrayList<>();
        for (int i = 0; i < frag.length; i++)
        {
          if ((frag[i] != null) && (frag[i].getScore() > 0))
          {
            fragTexts.add(frag[i]);
          }
        }
        frag= fragTexts.toArray(new TextFragment[0]);
      }
{code}

Why do we have a *boolean* and a *function* with the same name, {{mergeContiguousFragments}}? :(
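
For what it's worth, Java keeps fields and methods in separate namespaces, so the clash compiles and is resolved by whether the name is followed by a call. A minimal sketch follows; the class {{NameClashSketch}} and its {{run}} method are made up for illustration and are not part of Lucene or Solr:

```java
// Hypothetical class, not taken from Lucene/Solr: shows that a field and a
// method may legally share one identifier in Java.
class NameClashSketch {
    // Field named like the flag in the quoted snippet.
    private boolean mergeContiguousFragments = true;

    // Method with the same identifier; here it just reports the array length.
    private int mergeContiguousFragments(int[] frag) {
        return frag.length;
    }

    // Returns the method's result when the flag is set, otherwise -1.
    int run(int[] frag) {
        if (mergeContiguousFragments) {             // no parentheses: the field
            return mergeContiguousFragments(frag);  // parentheses: the method
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(new NameClashSketch().run(new int[] {1, 2, 3}));
    }
}
```

So the naming is legal, just harder to read than it needs to be.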

Anyway, here is the function, mergeContiguousFragments(frag ...):
{code}
private void mergeContiguousFragments(TextFragment[] frag)
  {
    boolean mergingStillBeingDone;
    if (frag.length > 1)
      do
      {
        mergingStillBeingDone = false; //initialise loop control flag
        //for each fragment, scan other frags looking for contiguous blocks
        for (int i = 0; i < frag.length; i++)
        {
          if (frag[i] == null)
          {
            continue;
          }
          //merge any contiguous blocks
          for (int x = 0; x < frag.length; x++)
          {
            if (frag[x] == null)
            {
              continue;
            }
            if (frag[i] == null)
            {
              break;
            }
            TextFragment frag1 = null;
            TextFragment frag2 = null;
            int frag1Num = 0;
            int frag2Num = 0;
            int bestScoringFragNum;
            int worstScoringFragNum;
            //if blocks are contiguous....
            if (frag[i].follows(frag[x]))
            {
              frag1 = frag[x];
              frag1Num = x;
              frag2 = frag[i];
              frag2Num = i;
            }
            else
              if (frag[x].follows(frag[i]))
              {
                frag1 = frag[i];
                frag1Num = i;
                frag2 = frag[x];
                frag2Num = x;
              }
            //merging required..
            if (frag1 != null)
            {
              if (frag1.getScore() > frag2.getScore())
              {
                bestScoringFragNum = frag1Num;
                worstScoringFragNum = frag2Num;
              }
              else
              {
                bestScoringFragNum = frag2Num;
                worstScoringFragNum = frag1Num;
              }
              frag1.merge(frag2);
              frag[worstScoringFragNum] = null;
              mergingStillBeingDone = true;
              frag[bestScoringFragNum] = frag1;
            }
          }
        }
      }
      while (mergingStillBeingDone);
  }
{code}

The function does have a valid condition on which it returns (the do/while ends once a full pass performs no merge), and I don't see unreachable code here. Maybe the problem exists only in earlier versions, or pardon me if I am looking at an entirely different module or component.
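
To check that reading, here is a rough, self-contained sketch of the same control flow. The {{Frag}} class is my own simplified stand-in for {{TextFragment}}: I am assuming that {{follows}} compares a start offset with the other fragment's end offset and that {{merge}} extends the span and keeps the higher score, which may not match the real class exactly. Spans are assumed non-empty so that a fragment never follows itself.

```java
// Sketch only: Frag is a hypothetical stand-in for TextFragment.
class MergeTerminationSketch {
    static final class Frag {
        int start, end;   // assumed non-empty span: start < end
        float score;
        Frag(int start, int end, float score) {
            this.start = start; this.end = end; this.score = score;
        }
        // Assumption: "follows" means this span begins where the other ends.
        boolean follows(Frag other) { return start == other.end; }
        // Assumption: merging extends the span and keeps the higher score.
        void merge(Frag next) { end = next.end; score = Math.max(score, next.score); }
    }

    // Same loop structure as the quoted method; returns how many merges ran.
    static int mergeContiguous(Frag[] frag) {
        int merges = 0;
        boolean mergingStillBeingDone;
        if (frag.length > 1) {
            do {
                mergingStillBeingDone = false;
                for (int i = 0; i < frag.length; i++) {
                    if (frag[i] == null) continue;
                    for (int x = 0; x < frag.length; x++) {
                        if (frag[x] == null) continue;
                        if (frag[i] == null) break;
                        Frag frag1 = null, frag2 = null;
                        int frag1Num = 0, frag2Num = 0;
                        if (frag[i].follows(frag[x])) {
                            frag1 = frag[x]; frag1Num = x; frag2 = frag[i]; frag2Num = i;
                        } else if (frag[x].follows(frag[i])) {
                            frag1 = frag[i]; frag1Num = i; frag2 = frag[x]; frag2Num = x;
                        }
                        if (frag1 != null) {
                            boolean firstWins = frag1.score > frag2.score;
                            int best = firstWins ? frag1Num : frag2Num;
                            int worst = firstWins ? frag2Num : frag1Num;
                            frag1.merge(frag2);
                            frag[worst] = null;   // one slot is emptied per merge
                            frag[best] = frag1;
                            merges++;
                            mergingStillBeingDone = true;
                        }
                    }
                }
            } while (mergingStillBeingDone);
        }
        return merges;
    }

    public static void main(String[] args) {
        Frag[] frag = {
            new Frag(0, 10, 1f), new Frag(10, 20, 2f),
            new Frag(20, 30, 0.5f), new Frag(40, 50, 1f)
        };
        System.out.println("merges: " + mergeContiguous(frag));
    }
}
```

Under those assumptions every merge empties exactly one slot, so there can be at most n - 1 merges for n fragments and the loop has to stop. Each pass still does on the order of n * n {{follows}} checks, so the worst case looks roughly cubic in the number of fragments, which could be expensive for large inputs even though it terminates.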

> Hi CPU utilization when highlighting mergecontiguous=true
> ---------------------------------------------------------
>
>                 Key: SOLR-11188
>                 URL: https://issues.apache.org/jira/browse/SOLR-11188
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Erick Erickson
>
> A user's list thread indicates that Solr 6.3 has very high CPU utilization with highlighting and mergecontiguous=true. This is a marker to see if we can reproduce. Notes:
> 1> this was pre-unifiedhighlighter. 
> 2> unknown whether this is still an issue in more recent Solrs
> I'll ask the OP to comment here with additional details.
> Assigning to myself to track. I won't do any work on this for quite a while, so anyone who wants to, please take it.


