Posted to dev@lucene.apache.org by Michael McCandless <mi...@elastic.co> on 2015/09/04 17:02:16 UTC

Apache Lucene Update 2015-09-04

Lucene summary for this week:

   - A 5.3.1 bugfix release may be coming soon
   <http://lucene.markmail.org/thread/d356iwsseh25agoj>

   - See the confusion matrix
   <https://issues.apache.org/jira/browse/LUCENE-6479> for a classifier

   - Remove a nasty classloader hack that broke MorfologikFilter
   <https://issues.apache.org/jira/browse/LUCENE-6774>, and fix it correctly
   <https://issues.apache.org/jira/browse/LUCENE-6775> so you can pass the
   dictionary as a URI

   - The new BoostQuery decouples Query from boosting
   <https://issues.apache.org/jira/browse/LUCENE-6590>, but it was missing
   its rewrite method <https://issues.apache.org/jira/browse/LUCENE-6781>

   - How can we compress postings payloads
   <https://issues.apache.org/jira/browse/LUCENE-6764>?

   - Nested conjunctions should always be flattened
   <https://issues.apache.org/jira/browse/LUCENE-6773>

   - MultiCollector did not handle early termination properly
   <https://issues.apache.org/jira/browse/LUCENE-6772>

   - Add a point-within-distance query
   <https://issues.apache.org/jira/browse/LUCENE-6698> implemented with BKD
   trees

   - Speed up IndexSearcher.count
   <https://issues.apache.org/jira/browse/LUCENE-6754> when a query is so
   simple (match all, single term) that we can use index statistics instead

   - The integration of BKDTree and Geo3D
   <https://issues.apache.org/jira/browse/LUCENE-6699> is done (for Lucene
   5.4.0), providing accurate and fast earth-surface "point in shape" queries,
   but we need to make its randomized tests more evil by simulating planets
   more squashed than earth
   <https://issues.apache.org/jira/browse/LUCENE-6776>, requiring some crazy
   math
   <https://issues.apache.org/jira/browse/LUCENE-6776?focusedCommentId=14729991&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14729991>,
   including Lagrange multipliers
   <https://en.wikipedia.org/wiki/Lagrange_multiplier>

   - GeoPointDistanceQuery is buggy with large distances
   <https://issues.apache.org/jira/browse/LUCENE-6780>

   - Don't use approximations
   <https://issues.apache.org/jira/browse/LUCENE-6761> for MatchAllDocsQuery,
   and give it a dedicated BulkScorer
   <https://issues.apache.org/jira/browse/LUCENE-6756>

   - Dodge bugs in Java's collators
   <https://issues.apache.org/jira/browse/LUCENE-6206>

   - Windows NTFS pending delete state for a file
   <https://issues.apache.org/jira/browse/LUCENE-6684> causes assertion
   failures in Lucene; we should fix Lucene's WindowsFS to also simulate
   this state <https://issues.apache.org/jira/browse/LUCENE-6771>

   - Symlinks to an index directory continue to cause problems for users
   <https://issues.apache.org/jira/browse/LUCENE-6770>

   - CheckIndex cannot handle corrupt .si
   <https://issues.apache.org/jira/browse/LUCENE-6762> files

   - Reduce heap used by CompressingStoredFieldsWriter
   <https://issues.apache.org/jira/browse/LUCENE-6779> when writing large
   strings during indexing

   - Reduce heap used by
   <https://issues.apache.org/jira/browse/LUCENE-6777> the new geo point
   queries by building the BytesRef on demand for sub-ranges

   - GeoPointDistanceRangeQuery will match points within a min/max distance
   range <https://issues.apache.org/jira/browse/LUCENE-6778>

   - When you incorrectly index nested documents, the resulting error
   messages are very confusing
   <https://issues.apache.org/jira/browse/LUCENE-6660>

   - DisjunctionMaxQuery, BoostingQuery and BoostedQuery now use
   IndexSearcher to create sub-weights
   <https://issues.apache.org/jira/browse/LUCENE-6746> so caching can apply
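
To make the IndexSearcher.count item above concrete, here is a minimal
sketch of the idea, not the actual patch. ToyIndex, countAll and countTerm
are hypothetical names invented for this illustration; Lucene's IndexReader
exposes similar statistics (numDocs(), docFreq(Term), hasDeletions()), but
the real change in LUCENE-6754 may differ in its details. The point is only
that a match-all count can come from the live document count, and a
single-term count from the term's document frequency, provided nothing has
been deleted.

```java
import java.util.*;

// Toy model only: these classes are hypothetical and are NOT Lucene's API.
final class CountShortcut {

    /** A tiny in-memory "index" that keeps per-term document frequencies. */
    static final class ToyIndex {
        private final List<Set<String>> docs = new ArrayList<>();
        private final Map<String, Integer> docFreq = new HashMap<>();
        private final BitSet deleted = new BitSet();

        /** Adds a document and returns its id. */
        int add(String... terms) {
            Set<String> unique = new HashSet<>(Arrays.asList(terms));
            docs.add(unique);
            for (String t : unique) {
                docFreq.merge(t, 1, Integer::sum);
            }
            return docs.size() - 1;
        }

        /** Marks a document as deleted; docFreq is deliberately left stale. */
        void delete(int docId) {
            deleted.set(docId);
        }

        boolean hasDeletions() {
            return !deleted.isEmpty();
        }

        /** Number of live (non-deleted) documents. */
        int numDocs() {
            return docs.size() - deleted.cardinality();
        }

        /** Documents containing the term, including deleted ones. */
        int docFreq(String term) {
            return docFreq.getOrDefault(term, 0);
        }

        /** Slow path: visit every live document. */
        int countByScanning(String term) {
            int count = 0;
            for (int i = 0; i < docs.size(); i++) {
                if (!deleted.get(i) && docs.get(i).contains(term)) {
                    count++;
                }
            }
            return count;
        }
    }

    /** Match-all: the live document count is already the answer. */
    static int countAll(ToyIndex index) {
        return index.numDocs();
    }

    /** Single term: docFreq is exact only when nothing has been deleted. */
    static int countTerm(ToyIndex index, String term) {
        if (!index.hasDeletions()) {
            return index.docFreq(term);
        }
        return index.countByScanning(term);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        int first = index.add("lucene", "search");
        index.add("lucene");
        index.add("solr");
        System.out.println(countAll(index));            // prints 3
        System.out.println(countTerm(index, "lucene")); // prints 2 (from docFreq)
        index.delete(first);
        System.out.println(countAll(index));            // prints 2
        System.out.println(countTerm(index, "lucene")); // prints 1 (from scanning)
    }
}
```

The fallback to scanning once a deletion exists is the conservative choice
in this sketch: the stored frequency still counts the deleted document, so
it can no longer be trusted as an exact count.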

Mike McCandless

Re: Apache Lucene Update 2015-09-04

Posted by Michael McCandless <lu...@mikemccandless.com>.

You're welcome!

Mike McCandless

http://blog.mikemccandless.com


On Fri, Sep 4, 2015 at 11:07 AM, david.w.smiley@gmail.com
<da...@gmail.com> wrote:
> Thanks for this Mike :-)
>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com

Re: Apache Lucene Update 2015-09-04

Posted by "david.w.smiley@gmail.com" <da...@gmail.com>.

Thanks for this Mike :-)

-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com