Posted to commits@lucene.apache.org by dw...@apache.org on 2018/05/25 10:23:33 UTC

[1/3] lucene-solr:master: LUCENE-8221: MoreLikeThis.setMaxDocFreqPct can easily int-overflow on larger indexes.

Repository: lucene-solr
Updated Branches:
  refs/heads/branch_7x d2e9ad200 -> 300ab00b4
  refs/heads/master 41ecad989 -> 7a5d9ca5e


LUCENE-8221: MoreLikeThis.setMaxDocFreqPct can easily int-overflow on larger indexes.


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/719fce80
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/719fce80
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/719fce80

Branch: refs/heads/master
Commit: 719fce80269706cf5c6f091cc71e39620f7b6cb2
Parents: 41ecad9
Author: Dawid Weiss <dw...@apache.org>
Authored: Fri May 25 12:16:22 2018 +0200
Committer: Dawid Weiss <dw...@apache.org>
Committed: Fri May 25 12:16:22 2018 +0200

----------------------------------------------------------------------
 lucene/CHANGES.txt                                           | 3 +++
 .../src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java | 8 +++++---
 2 files changed, 8 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/719fce80/lucene/CHANGES.txt
----------------------------------------------------------------------
diff --git a/lucene/CHANGES.txt b/lucene/CHANGES.txt
index 1b2ba84..d1f5395 100644
--- a/lucene/CHANGES.txt
+++ b/lucene/CHANGES.txt
@@ -202,6 +202,9 @@ New Features
 
 Bug Fixes
 
+* LUCENE-8221: MoreLikeThis.setMaxDocFreqPct can easily int-overflow on larger
+  indexes.
+
 * LUCENE-8266: Detect bogus tiles when creating a standard polygon and
   throw a TileException. (Ignacio Vera)
 

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/719fce80/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
----------------------------------------------------------------------
diff --git a/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java b/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
index 9afa687..e145f74 100644
--- a/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
+++ b/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
@@ -418,14 +418,16 @@ public final class MoreLikeThis {
    * Set the maximum percentage in which words may still appear. Words that appear
    * in more than this many percent of all docs will be ignored.
    *
+   * This method calls {@link #setMaxDocFreq(int)} internally (both conditions cannot
+   * be used at the same time).
+   *
    * @param maxPercentage the maximum percentage of documents (0-100) that a term may appear
-   * in to be still considered relevant
+   * in to be still considered relevant.
    */
   public void setMaxDocFreqPct(int maxPercentage) {
-    this.maxDocFreq = maxPercentage * ir.numDocs() / 100;
+    setMaxDocFreq(Math.toIntExact((long) maxPercentage * ir.numDocs() / 100));
   }
 
-
   /**
    * Returns whether to boost terms in query based on "score" or not. The default is
    * {@link #DEFAULT_BOOST}.
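
To illustrate the overflow that LUCENE-8221 addresses, here is a minimal standalone sketch of the two expressions from the hunk above. The class name, the helper method names, and the index size of 100,000,000 documents are hypothetical; only the arithmetic is taken from the diff. It is an illustration, not the MoreLikeThis code itself.

```java
public class MaxDocFreqPctOverflow {

    // Pre-fix arithmetic: the multiplication is carried out in int,
    // so the product silently wraps around once it exceeds Integer.MAX_VALUE.
    static int oldMaxDocFreq(int maxPercentage, int numDocs) {
        return maxPercentage * numDocs / 100;
    }

    // Post-fix arithmetic: widen to long before multiplying, then narrow
    // with Math.toIntExact, which throws ArithmeticException instead of wrapping.
    static int newMaxDocFreq(int maxPercentage, int numDocs) {
        return Math.toIntExact((long) maxPercentage * numDocs / 100);
    }

    public static void main(String[] args) {
        int numDocs = 100_000_000; // hypothetical index size

        // 50 * 100_000_000 does not fit in an int, so the old result is wrong.
        System.out.println("old: " + oldMaxDocFreq(50, numDocs)); // prints "old: 7050327"

        // The widened computation yields the expected half of the index.
        System.out.println("new: " + newMaxDocFreq(50, numDocs)); // prints "new: 50000000"
    }
}
```

With percentages in the documented 0-100 range the widened result never exceeds the document count, so Math.toIntExact should only throw for out-of-range arguments.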


[3/3] lucene-solr:master: LUCENE-8333: Switch MoreLikeThis.setMaxDocFreqPct to use maxDoc instead of numDoc

Posted by dw...@apache.org.
LUCENE-8333: Switch MoreLikeThis.setMaxDocFreqPct to use maxDoc instead of numDoc


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/7a5d9ca5
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/7a5d9ca5
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/7a5d9ca5

Branch: refs/heads/master
Commit: 7a5d9ca5e880425cea1d67035c6be0122ce8dd5b
Parents: 719fce8
Author: Dawid Weiss <dw...@apache.org>
Authored: Fri May 25 12:22:21 2018 +0200
Committer: Dawid Weiss <dw...@apache.org>
Committed: Fri May 25 12:22:21 2018 +0200

----------------------------------------------------------------------
 lucene/CHANGES.txt                                                | 3 +++
 .../src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java      | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/7a5d9ca5/lucene/CHANGES.txt
----------------------------------------------------------------------
diff --git a/lucene/CHANGES.txt b/lucene/CHANGES.txt
index d1f5395..b643af1 100644
--- a/lucene/CHANGES.txt
+++ b/lucene/CHANGES.txt
@@ -49,6 +49,9 @@ API Changes
 
 Changes in Runtime Behavior
 
+* LUCENE-8333: Switch MoreLikeThis.setMaxDocFreqPct to use maxDoc instead of
+  numDocs. (Robert Muir, Dawid Weiss).
+
 * LUCENE-7837: Indices that were created before the previous major version
   will now fail to open even if they have been merged with the previous major
   version. (Adrien Grand)

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/7a5d9ca5/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
----------------------------------------------------------------------
diff --git a/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java b/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
index e145f74..8ea3933 100644
--- a/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
+++ b/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
@@ -425,7 +425,7 @@ public final class MoreLikeThis {
    * in to be still considered relevant.
    */
   public void setMaxDocFreqPct(int maxPercentage) {
-    setMaxDocFreq(Math.toIntExact((long) maxPercentage * ir.numDocs() / 100));
+    setMaxDocFreq(Math.toIntExact((long) maxPercentage * ir.maxDoc() / 100));
   }
 
   /**
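
As a rough sketch of what the LUCENE-8333 switch changes in practice: in Lucene, IndexReader.numDocs() counts live documents only, while IndexReader.maxDoc() also counts documents that are deleted but not yet merged away. The commit does not state its rationale; one plausible reading is that per-term document frequencies are likewise not adjusted for deletions, so maxDoc is the more consistent base. The class name, helper name, and document counts below are hypothetical, and the example uses plain integers rather than a real IndexReader.

```java
public class MaxDocVsNumDocs {

    // Same expression as in the patched setMaxDocFreqPct, with the document
    // count passed in explicitly instead of being read from an IndexReader.
    static int maxDocFreqFor(int maxPercentage, int docCount) {
        return Math.toIntExact((long) maxPercentage * docCount / 100);
    }

    public static void main(String[] args) {
        int maxDoc = 1_000; // hypothetical: all document slots, deleted ones included
        int numDocs = 800;  // hypothetical: live documents only (200 deleted)

        // Before LUCENE-8333 the cutoff was derived from live documents.
        System.out.println("numDocs-based cutoff: " + maxDocFreqFor(50, numDocs)); // prints 400

        // After LUCENE-8333 the cutoff is derived from maxDoc.
        System.out.println("maxDoc-based cutoff: " + maxDocFreqFor(50, maxDoc)); // prints 500
    }
}
```

On an index without deletions the two counts coincide and the cutoff is unchanged.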


[2/3] lucene-solr:branch_7x: LUCENE-8221: MoreLikeThis.setMaxDocFreqPct can easily int-overflow on larger indexes.

Posted by dw...@apache.org.
LUCENE-8221: MoreLikeThis.setMaxDocFreqPct can easily int-overflow on larger indexes.


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/300ab00b
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/300ab00b
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/300ab00b

Branch: refs/heads/branch_7x
Commit: 300ab00b4c75810a591750c785d47fcba2d65ebb
Parents: d2e9ad2
Author: Dawid Weiss <dw...@apache.org>
Authored: Fri May 25 12:16:22 2018 +0200
Committer: Dawid Weiss <dw...@apache.org>
Committed: Fri May 25 12:18:11 2018 +0200

----------------------------------------------------------------------
 lucene/CHANGES.txt                                           | 3 +++
 .../src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java | 8 +++++---
 2 files changed, 8 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/300ab00b/lucene/CHANGES.txt
----------------------------------------------------------------------
diff --git a/lucene/CHANGES.txt b/lucene/CHANGES.txt
index 06c12cc..0fa942c 100644
--- a/lucene/CHANGES.txt
+++ b/lucene/CHANGES.txt
@@ -101,6 +101,9 @@ New Features
 
 Bug Fixes
 
+* LUCENE-8221: MoreLikeThis.setMaxDocFreqPct can easily int-overflow on larger
+  indexes.
+
 * LUCENE-8266: Detect bogus tiles when creating a standard polygon
   and throw a TileException. (Ignacio Vera)
 

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/300ab00b/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
----------------------------------------------------------------------
diff --git a/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java b/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
index ea02af3..22a2291 100644
--- a/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
+++ b/lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
@@ -418,14 +418,16 @@ public final class MoreLikeThis {
    * Set the maximum percentage in which words may still appear. Words that appear
    * in more than this many percent of all docs will be ignored.
    *
+   * This method calls {@link #setMaxDocFreq(int)} internally (both conditions cannot
+   * be used at the same time).
+   *
    * @param maxPercentage the maximum percentage of documents (0-100) that a term may appear
-   * in to be still considered relevant
+   * in to be still considered relevant.
    */
   public void setMaxDocFreqPct(int maxPercentage) {
-    this.maxDocFreq = maxPercentage * ir.numDocs() / 100;
+    setMaxDocFreq(Math.toIntExact((long) maxPercentage * ir.numDocs() / 100));
   }
 
-
   /**
    * Returns whether to boost terms in query based on "score" or not. The default is
    * {@link #DEFAULT_BOOST}.