Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/06/02 04:29:40 UTC

[GitHub] [lucene] jtibshirani opened a new pull request #164: LUCENE-9905: Allow Lucene90Codec to be configured with a per-field vector format

jtibshirani opened a new pull request #164:
URL: https://github.com/apache/lucene/pull/164


   Previously only AssertingCodec could handle a per-field vector format. This PR
   also strengthens the checks in TestPerFieldVectorFormat.
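
   As a rough illustration of the per-field idea, here is a minimal,
   self-contained sketch. It does not use Lucene: SketchVectorFormat,
   SketchCodec and newTestCodec are hypothetical stand-ins, and only the
   hook name getVectorFormatForField and the "empty" field name are
   borrowed from the test in this PR. It assumes a codec chooses a format
   by field name and that an "empty" format rejects writes.

```java
// Hypothetical stand-ins for illustration only; none of these are Lucene classes.
public class PerFieldFormatSketch {

  /** Minimal stand-in for a vector format: it can only accept or reject a write. */
  public interface SketchVectorFormat {
    String name();

    void write(String field, float[] vector);
  }

  /** Accepts every write (stands in for the default format). */
  public static final SketchVectorFormat DEFAULT =
      new SketchVectorFormat() {
        @Override
        public String name() {
          return "default";
        }

        @Override
        public void write(String field, float[] vector) {
          // A real format would encode the vector; the sketch does nothing.
        }
      };

  /** Rejects every write (stands in for a format that does not support writes). */
  public static final SketchVectorFormat EMPTY =
      new SketchVectorFormat() {
        @Override
        public String name() {
          return "empty";
        }

        @Override
        public void write(String field, float[] vector) {
          throw new UnsupportedOperationException("writes not supported for field: " + field);
        }
      };

  /** Codec stand-in with an overridable per-field hook. */
  public static class SketchCodec {
    public SketchVectorFormat getVectorFormatForField(String field) {
      return DEFAULT;
    }

    /** Routes the write to whichever format the hook selects for this field. */
    public final void addVector(String field, float[] vector) {
      getVectorFormatForField(field).write(field, vector);
    }
  }

  /** Builds a codec that sends the "empty" field to EMPTY and everything else to DEFAULT. */
  public static SketchCodec newTestCodec() {
    return new SketchCodec() {
      @Override
      public SketchVectorFormat getVectorFormatForField(String field) {
        return "empty".equals(field) ? EMPTY : DEFAULT;
      }
    };
  }
}
```

   With this sketch, a write to "field" goes through the default format,
   while a write to "empty" throws. That is the kind of behaviour a
   per-field test can assert on to confirm the override is really used.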


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org

[GitHub] [lucene] jtibshirani commented on pull request #164: LUCENE-9905: Allow Lucene90Codec to be configured with a per-field vector format

Posted by GitBox <gi...@apache.org>.

jtibshirani commented on pull request #164:
URL: https://github.com/apache/lucene/pull/164#issuecomment-853136935


   Thanks for reviewing!



[GitHub] [lucene] msokolov commented on a change in pull request #164: LUCENE-9905: Allow Lucene90Codec to be configured with a per-field vector format

Posted by GitBox <gi...@apache.org>.

msokolov commented on a change in pull request #164:
URL: https://github.com/apache/lucene/pull/164#discussion_r643912033



##########
File path: lucene/core/src/test/org/apache/lucene/codecs/perfield/TestPerFieldVectorFormat.java
##########
@@ -52,53 +52,54 @@ protected Codec getCodec() {
     return codec;
   }
 
-  // just a simple trivial test
   public void testTwoFieldsTwoFormats() throws IOException {
     Analyzer analyzer = new MockAnalyzer(random());
 
     try (Directory directory = newDirectory()) {
       // we don't use RandomIndexWriter because it might add more values than we expect !!!!1
       IndexWriterConfig iwc = newIndexWriterConfig(analyzer);
-      final VectorFormat fast = TestUtil.getDefaultVectorFormat();
-      final VectorFormat slow = VectorFormat.forName("Asserting");
+      VectorFormat defaultFormat = TestUtil.getDefaultVectorFormat();
+      VectorFormat emptyFormat = VectorFormat.EMPTY;
       iwc.setCodec(
           new AssertingCodec() {
             @Override
             public VectorFormat getVectorFormatForField(String field) {
-              if ("v1".equals(field)) {
-                return fast;
+              if ("empty".equals(field)) {
+                return emptyFormat;
               } else {
-                return slow;
+                return defaultFormat;
               }
             }
           });
+
       try (IndexWriter iwriter = new IndexWriter(directory, iwc)) {
         Document doc = new Document();
         doc.add(newTextField("id", "1", Field.Store.YES));
-        doc.add(new VectorField("v1", new float[] {1, 2, 3}));
+        doc.add(new VectorField("field", new float[] {1, 2, 3}));
         iwriter.addDocument(doc);
-        doc = new Document();
+        iwriter.commit();
+
+        // Check that we use the empty vector format, which doesn't support writes
+        doc.clear();
         doc.add(newTextField("id", "2", Field.Store.YES));
-        doc.add(new VectorField("v2", new float[] {4, 5, 6}));
-        iwriter.addDocument(doc);
+        doc.add(new VectorField("empty", new float[] {4, 5, 6}));
+        expectThrows(
+            RuntimeException.class,
+            () -> {
+              iwriter.addDocument(doc);
+              iwriter.commit();
+            });
       }
 
-      // Now search the index:
+      // Now search for the field that was successfully indexed
       try (IndexReader ireader = DirectoryReader.open(directory)) {
         TopDocs hits1 =
             ireader
                 .leaves()
                 .get(0)
                 .reader()
-                .searchNearestVectors("v1", new float[] {1, 2, 3}, 10, 1);
+                .searchNearestVectors("field", new float[] {1, 2, 3}, 10, 1);
         assertEquals(1, hits1.scoreDocs.length);
-        TopDocs hits2 =

Review comment:
       weird, what was this doing here :)





[GitHub] [lucene] jtibshirani merged pull request #164: LUCENE-9905: Allow Lucene90Codec to be configured with a per-field vector format

Posted by GitBox <gi...@apache.org>.

jtibshirani merged pull request #164:
URL: https://github.com/apache/lucene/pull/164


   

