Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/08/19 16:57:14 UTC

[GitHub] [lucene] msokolov opened a new pull request, #1074: Fix for bad cast when sorting a KnnVectors index over BytesRef

msokolov opened a new pull request, #1074:
URL: https://github.com/apache/lucene/pull/1074

   Thanks @jtibshirani for noticing this one! Clearly we were missing some tests, so I beefed up BaseKnnVectorsFormatTestCase a bit, adding a specific sorted-index test over bytes as well as a testRandomBytes. Some additional coverage would probably be a good idea.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] msokolov commented on pull request #1074: Fix for bad cast when sorting a KnnVectors index over BytesRef

Posted by GitBox <gi...@apache.org>.
msokolov commented on PR #1074:
URL: https://github.com/apache/lucene/pull/1074#issuecomment-1221307962

   Hmm, the new testRandomBytes failed with seed CBC12662A8273C68.




[GitHub] [lucene] msokolov commented on a diff in pull request #1074: Fix for bad cast when sorting a KnnVectors index over BytesRef

Posted by GitBox <gi...@apache.org>.
msokolov commented on code in PR #1074:
URL: https://github.com/apache/lucene/pull/1074#discussion_r951374398


##########
lucene/codecs/src/java/org/apache/lucene/codecs/simpletext/SimpleTextKnnVectorsWriter.java:
##########
@@ -76,6 +77,10 @@ public class SimpleTextKnnVectorsWriter extends BufferingKnnVectorsWriter {
   public void writeField(FieldInfo fieldInfo, KnnVectorsReader knnVectorsReader, int maxDoc)
       throws IOException {
     VectorValues vectors = knnVectorsReader.getVectorValues(fieldInfo.name);
+    if (fieldInfo.getVectorEncoding() != VectorEncoding.FLOAT32) {

Review Comment:
   Yeah, oops. This exception is also breaking all kinds of tests. I opened https://github.com/apache/lucene/issues/1587





[GitHub] [lucene] jtibshirani commented on a diff in pull request #1074: Fix for bad cast when sorting a KnnVectors index over BytesRef

Posted by GitBox <gi...@apache.org>.
jtibshirani commented on code in PR #1074:
URL: https://github.com/apache/lucene/pull/1074#discussion_r950897203


##########
lucene/codecs/src/java/org/apache/lucene/codecs/simpletext/SimpleTextKnnVectorsWriter.java:
##########
@@ -76,6 +77,10 @@ public class SimpleTextKnnVectorsWriter extends BufferingKnnVectorsWriter {
   public void writeField(FieldInfo fieldInfo, KnnVectorsReader knnVectorsReader, int maxDoc)
       throws IOException {
     VectorValues vectors = knnVectorsReader.getVectorValues(fieldInfo.name);
+    if (fieldInfo.getVectorEncoding() != VectorEncoding.FLOAT32) {

Review Comment:
   Since `VectorEncoding` belongs to `FieldInfo`, every codec implementation is expected to support it. (The old HNSW codecs are the only ones that don't, which makes sense.) So it seems like we should update `SimpleTextKnnVectorsWriter` to support byte encodings. Could we at least file an issue to track it?
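
   As a rough illustration of what supporting both encodings could look like, here is a minimal, self-contained sketch that picks the decoding path from the declared encoding instead of casting unconditionally or throwing. It is not the actual `SimpleTextKnnVectorsWriter` code: the `Encoding` enum and the `toText` helper are hypothetical stand-ins, and no Lucene classes are used.

```java
import java.util.Arrays;

public class EncodingDispatchSketch {
  // Hypothetical stand-in for the field's vector encoding; not Lucene's enum.
  enum Encoding { BYTE, FLOAT32 }

  // Renders one vector as text. The cast is chosen from the declared
  // encoding, so a byte vector is never treated as a float[].
  static String toText(Encoding encoding, Object vector) {
    switch (encoding) {
      case FLOAT32:
        return Arrays.toString((float[]) vector);
      case BYTE:
        return Arrays.toString((byte[]) vector);
      default:
        throw new IllegalArgumentException("Unsupported encoding: " + encoding);
    }
  }

  public static void main(String[] args) {
    System.out.println(toText(Encoding.FLOAT32, new float[] {0.5f, 1.0f}));
    System.out.println(toText(Encoding.BYTE, new byte[] {1, -2, 3}));
  }
}
```

   Passing the wrong array type for an encoding still fails with a `ClassCastException`, which is the kind of mismatch a sorted-index test over byte vectors would surface.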





[GitHub] [lucene] msokolov merged pull request #1074: Fix for bad cast when sorting a KnnVectors index over BytesRef

Posted by GitBox <gi...@apache.org>.
msokolov merged PR #1074:
URL: https://github.com/apache/lucene/pull/1074

