Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/11/09 22:19:47 UTC

[GitHub] [lucene] jtibshirani opened a new pull request #432: LUCENE-10228: Ensure PerFieldKnnVectorsFormat uses right format name

jtibshirani opened a new pull request #432:
URL: https://github.com/apache/lucene/pull/432


   Previously, when creating a KnnVectorsWriter for merging, we consulted the
   existing "PER_FIELD_SUFFIX_KEY" attribute to determine the format's per-field
   suffix. This isn't correct, since we could be using a new codec (one that
   produces different formats/suffixes).
   
   This commit modifies TestPerFieldDocValuesFormat#testMergeUsesNewFormat to
   trigger the problem. Without the fix, it throws an error like
   "java.nio.file.FileAlreadyExistsException: File
   "_3_Lucene90HnswVectorsFormat_0.vem" was already written to."
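
   The collision described above can be modelled with a small sketch. It is
   a toy model, not Lucene code: the class PerFieldSuffixSketch, the Field
   holder, the field names, and the "instanceA"/"instanceB" labels are all
   hypothetical. It assumes the old codec shared one format instance across
   two vector fields (so both carry the suffix attribute "0"), while the new
   codec uses a separate instance per field under the same format name. Only
   the segment name "_3", the format name, and the ".vem" extension are taken
   from the error message above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these types are Lucene classes.
class PerFieldSuffixSketch {
  static final String FORMAT_NAME = "Lucene90HnswVectorsFormat";

  /** Hypothetical field: the format instance chosen by the new codec, plus the
   *  suffix attribute that the old codec left behind on the field. */
  static final class Field {
    final String name;
    final String formatInstance;
    final String staleSuffixAttr;

    Field(String name, String formatInstance, String staleSuffixAttr) {
      this.name = name;
      this.formatInstance = formatInstance;
      this.staleSuffixAttr = staleSuffixAttr;
    }
  }

  static List<Field> demoFields() {
    // Assumed scenario: both fields were written with suffix "0" by the old
    // codec, but the new codec maps them to two distinct format instances.
    return Arrays.asList(
        new Field("vec_a", "instanceA", "0"),
        new Field("vec_b", "instanceB", "0"));
  }

  /** Returns the file name created for each new writer, or throws on a name clash. */
  static List<String> assignFileNames(List<Field> fields, boolean trustStaleAttribute) {
    Set<String> seenInstances = new HashSet<>();
    Map<String, Integer> lastSuffixByFormatName = new HashMap<>();
    Set<String> written = new HashSet<>();
    List<String> files = new ArrayList<>();

    for (Field f : fields) {
      if (!seenInstances.add(f.formatInstance)) {
        continue; // a writer already exists for this format instance
      }
      int suffix;
      if (trustStaleAttribute && f.staleSuffixAttr != null) {
        // Old behaviour: reuse the suffix recorded by whichever codec wrote the segment.
        suffix = Integer.parseInt(f.staleSuffixAttr);
      } else {
        // Behaviour after the fix (as modelled here): next free suffix per format name.
        Integer last = lastSuffixByFormatName.get(FORMAT_NAME);
        suffix = (last == null) ? 0 : last + 1;
      }
      lastSuffixByFormatName.put(FORMAT_NAME, suffix);

      String file = "_3_" + FORMAT_NAME + "_" + suffix + ".vem";
      if (!written.add(file)) {
        throw new IllegalStateException("File \"" + file + "\" was already written to.");
      }
      files.add(file);
    }
    return files;
  }

  public static void main(String[] args) {
    // Fresh suffixes: two distinct files, suffixes 0 and 1.
    System.out.println(assignFileNames(demoFields(), false));
    // Stale attributes: both writers ask for suffix 0 and the second one clashes.
    try {
      assignFileNames(demoFields(), true);
    } catch (IllegalStateException e) {
      System.out.println("collision: " + e.getMessage());
    }
  }
}
```

   In this model, ignoring the stale attribute gives the two writers distinct
   suffixes, while trusting it makes the second writer request the file the
   first one already created.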


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jtibshirani commented on a change in pull request #432: LUCENE-10228: Ensure PerFieldKnnVectorsFormat uses right format name

Posted by GitBox <gi...@apache.org>.
jtibshirani commented on a change in pull request #432:
URL: https://github.com/apache/lucene/pull/432#discussion_r746919354



##########
File path: lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldKnnVectorsFormat.java
##########
@@ -123,25 +123,17 @@ private KnnVectorsWriter getInstance(FieldInfo field) throws IOException {
       final String formatName = format.getName();
 
       field.putAttribute(PER_FIELD_FORMAT_KEY, formatName);
-      Integer suffix = null;
+      Integer suffix;
 
       WriterAndSuffix writerAndSuffix = formats.get(format);
       if (writerAndSuffix == null) {
         // First time we are seeing this format; create a new instance
 
-        String suffixAtt = field.getAttribute(PER_FIELD_SUFFIX_KEY);

Review comment:
       No problem at all, this per-field logic is really tricky.






[GitHub] [lucene] jtibshirani commented on a change in pull request #432: LUCENE-10228: Ensure PerFieldKnnVectorsFormat uses right format name

Posted by GitBox <gi...@apache.org>.
jtibshirani commented on a change in pull request #432:
URL: https://github.com/apache/lucene/pull/432#discussion_r746091086



##########
File path: lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldKnnVectorsFormat.java
##########
@@ -123,25 +123,17 @@ private KnnVectorsWriter getInstance(FieldInfo field) throws IOException {
       final String formatName = format.getName();
 
       field.putAttribute(PER_FIELD_FORMAT_KEY, formatName);
-      Integer suffix = null;
+      Integer suffix;
 
       WriterAndSuffix writerAndSuffix = formats.get(format);
       if (writerAndSuffix == null) {
         // First time we are seeing this format; create a new instance
 
-        String suffixAtt = field.getAttribute(PER_FIELD_SUFFIX_KEY);

Review comment:
       I think this was accidentally copied over from `PerFieldDocValuesFormat`. The logic there only applies to doc values updates.






[GitHub] [lucene] jtibshirani commented on pull request #432: LUCENE-10228: Ensure PerFieldKnnVectorsFormat uses right format name

Posted by GitBox <gi...@apache.org>.
jtibshirani commented on pull request #432:
URL: https://github.com/apache/lucene/pull/432#issuecomment-965503435


   I'll backport right now to 9.0.




[GitHub] [lucene] msokolov commented on a change in pull request #432: LUCENE-10228: Ensure PerFieldKnnVectorsFormat uses right format name

Posted by GitBox <gi...@apache.org>.
msokolov commented on a change in pull request #432:
URL: https://github.com/apache/lucene/pull/432#discussion_r746801460



##########
File path: lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldKnnVectorsFormat.java
##########
@@ -123,25 +123,17 @@ private KnnVectorsWriter getInstance(FieldInfo field) throws IOException {
       final String formatName = format.getName();
 
       field.putAttribute(PER_FIELD_FORMAT_KEY, formatName);
-      Integer suffix = null;
+      Integer suffix;
 
       WriterAndSuffix writerAndSuffix = formats.get(format);
       if (writerAndSuffix == null) {
         // First time we are seeing this format; create a new instance
 
-        String suffixAtt = field.getAttribute(PER_FIELD_SUFFIX_KEY);

Review comment:
       Yes, my bad. Nice catch, and thanks for fixing. I guess we only ever really tested with a single vector field per document.






[GitHub] [lucene] jtibshirani merged pull request #432: LUCENE-10228: Ensure PerFieldKnnVectorsFormat uses right format name

Posted by GitBox <gi...@apache.org>.
jtibshirani merged pull request #432:
URL: https://github.com/apache/lucene/pull/432


   

