You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucenenet.apache.org by GitBox <gi...@apache.org> on 2020/09/23 12:19:39 UTC

[GitHub] [lucenenet] theolivenbaum opened a new issue #350: Change ScoreDoc to struct?

theolivenbaum opened a new issue #350:
URL: https://github.com/apache/lucenenet/issues/350


   Doing a random check with a memory profiler on the search benchmark test, and it seems like a lot of the objects being allocated are ScoreDoc instances:
   
   ![image](https://user-images.githubusercontent.com/8791811/94011329-7c009480-fda7-11ea-82de-88d7be9502b6.png)
   
   Is there any good reason to not refactor this as a struct?
   
   The code is so simple that it feels like an easy win to change it to a struct:
   
   ```csharp
       public class ScoreDoc
       {
           public float Score { get; set; } // LUCENENET NOTE: For some reason, this was not readonly - should it be?
           public int Doc { get; set; } // LUCENENET NOTE: For some reason, this was not readonly - should it be?
           public int ShardIndex { get; set; } // LUCENENET NOTE: For some reason, this was not readonly - should it be?
   
           public ScoreDoc(int doc, float score) : this(doc, score, -1) { }
   
           public ScoreDoc(int doc, float score, int shardIndex)
           {
               this.Doc = doc;
               this.Score = score;
               this.ShardIndex = shardIndex;
           }
       }
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] theolivenbaum commented on issue #350: Change ScoreDoc to struct?

Posted by GitBox <gi...@apache.org>.
theolivenbaum commented on issue #350:
URL: https://github.com/apache/lucenenet/issues/350#issuecomment-697716987


   Interesting - because we use our own ResultsCollector, this is actually not relevant for our usage of Lucene.NET at all - so I'll leave it as is, as there is a path to optimize it already by not using the official TopDocs results collector


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] theolivenbaum closed issue #350: Change ScoreDoc to struct?

Posted by GitBox <gi...@apache.org>.
theolivenbaum closed issue #350:
URL: https://github.com/apache/lucenenet/issues/350


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] theolivenbaum edited a comment on issue #350: Change ScoreDoc to struct?

Posted by GitBox <gi...@apache.org>.
theolivenbaum edited a comment on issue #350:
URL: https://github.com/apache/lucenenet/issues/350#issuecomment-697484139


   Guess that's exactly the cost of doing so - but i've the impression there are already a lot of places on the Lucene source code that rely on the similar pattern, instead of using generics + interfaces to enforce this at compile time:
   
   ```csharp
   if(!(obj is Class)) throw new Exception("Expected type X");
   ```
   
   This could be one case where the switch is worth, as currently Lucene allocates one ScoreDoc object for every search hit that needs to be scored, per search... But obviously better to test first :)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] theolivenbaum commented on issue #350: Change ScoreDoc to struct?

Posted by GitBox <gi...@apache.org>.
theolivenbaum commented on issue #350:
URL: https://github.com/apache/lucenenet/issues/350#issuecomment-697351627


   It seems like there are only 3 classes that derive from `ScoreDoc`: `OneGroup`, `FieldDoc`, `FieldValueHitQueue.Entry`
   
   One possible way to go about this (as we can't use generics, and using an interface on the struct will result in boxing anyway), would be to add two fields to ScoreDoc, one to hold the expected content type, and the other to hold the content:
   
   ````csharp
   public struct ScoreDoc
   {
       private byte _contentType;
       private object _content;
   
       public float Score { get; }
       public int Doc { get; }
       public int ShardIndex { get; } 
   
       public bool IsNull() => _contentType == 0;
       public bool IsNotNull() => _contentType != 0;
       public bool IsFieldDoc() => _contentType == _fieldDocType;
       public bool IsEntry() => _contentType == _entryType || _contentType == _oneGroupType; // OneGroupContent derives from Entry
       public bool IsOneGroup() => _contentType == _oneGroupType ;
   
       private const byte _scoreDocType = 1;
       private const byte _fieldDocType = 2;
       private const byte _entryType = 3;
       private const byte _oneGroupType = 4;
   
       public FieldDocContent AsFieldDoc() 
       {
           if(IsFieldDoc()) return (FieldDocContent)_content;
           throw new Exception("Invalid type");
      }
   
       public EntryContent AsEntry() 
       {
           if(IsEntry()) return (EntryContent )_content;
           throw new Exception("Invalid type");
      }
   
       public OneGroupContent AsOneGroup() 
       {
           if(IsOneGroup()) return (OneGroupContent )_content;
           throw new Exception("Invalid type");
      }
   }
   ````
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] theolivenbaum commented on issue #350: Change ScoreDoc to struct?

Posted by GitBox <gi...@apache.org>.
theolivenbaum commented on issue #350:
URL: https://github.com/apache/lucenenet/issues/350#issuecomment-697484139


   Guess that's exactly the cost of doing so - but i've the impression there are already a lot of places on the Lucene source code that rely on the similar pattern, instead of using generics + interfaces to enforce this at compile time:
   
   ```csharp
   if(!(obj is Class)) throw new Exception("Expected type X");
   ```
   
   IMO, this could be one case where the switch is worth, as currently Lucene allocates one ScoreDoc object for every search hit that needs to be scored, per search... But obviously better to test first :)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] NightOwl888 commented on issue #350: Change ScoreDoc to struct?

Posted by GitBox <gi...@apache.org>.
NightOwl888 commented on issue #350:
URL: https://github.com/apache/lucenenet/issues/350#issuecomment-697429988


   Given the fact that Lucene relies on inheritance in this case to extend a data container with additional fields, it feels like we are trying to force a struct into the design. Specifically, we are breaking the is-a relationship in the inheritance model for all of the subclasses, which would make it more difficult to use (assuming that it doesn't need to be cast often by users, in that case we may actually be helping).
   
   The usage of this class is such that it acts as a data delivery mechanism, similar to the tables in Entity Framework, which are classes.
   
   As such, IMO this probably isn't a good candidate for a struct.
   
   I am curious to hear other opinions on this case, though. Especially from users. This does seem to have potential to make a big performance impact.
   
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org