Posted to dev@lucene.apache.org by "Chris Male (JIRA)" <ji...@apache.org> on 2009/07/03 15:24:47 UTC

[jira] Created: (LUCENE-1732) Multi-threaded Spatial Search

Multi-threaded Spatial Search
-----------------------------

                 Key: LUCENE-1732
                 URL: https://issues.apache.org/jira/browse/LUCENE-1732
             Project: Lucene - Java
          Issue Type: Improvement
          Components: contrib/spatial
    Affects Versions: 2.9
            Reporter: Chris Male


The attached patch is a large refactoring of the spatial search contrib.  The primary contribution is the new ThreadedDistanceFilter, which uses an ExecutorService to filter the documents in multiple threads.  As a result, the time taken to filter 1.2 million documents has been reduced from nearly 3s to between 500 and 800ms.
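
A minimal sketch of the general technique, not the code from the patch: the document range is split into chunks, each chunk is checked on a worker from an ExecutorService, and the per-chunk bit sets are merged at the end.  The class name ThreadedDistanceSketch, the plain x/y coordinate arrays and the simple Euclidean distance check are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; this is not the ThreadedDistanceFilter from the patch.
class ThreadedDistanceSketch {

    /** Returns a bit set with one bit set per document lying within radius of (cx, cy). */
    static BitSet filter(double[] xs, double[] ys, double cx, double cy,
                         double radius, int threads) {
        final int n = xs.length;
        final double radiusSq = radius * radius;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Split the document ids into roughly equal contiguous chunks.
            int chunk = Math.max(1, (n + threads - 1) / threads);
            List<Future<BitSet>> parts = new ArrayList<>();
            for (int start = 0; start < n; start += chunk) {
                final int from = start;
                final int to = Math.min(n, start + chunk);
                Callable<BitSet> task = () -> {
                    // Each task writes to its own bit set, so no locking is needed.
                    BitSet local = new BitSet(n);
                    for (int doc = from; doc < to; doc++) {
                        double dx = xs[doc] - cx;
                        double dy = ys[doc] - cy;
                        if (dx * dx + dy * dy <= radiusSq) {
                            local.set(doc);
                        }
                    }
                    return local;
                };
                parts.add(pool.submit(task));
            }
            // Merge the per-chunk results on the calling thread.
            BitSet result = new BitSet(n);
            for (Future<BitSet> part : parts) {
                result.or(part.get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while filtering", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("filter task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Giving each task its own bit set trades some memory for lock-free workers.  A real filter would probably reuse one long-lived pool rather than create one per call as this sketch does.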

As part of this work, the DistanceQueryBuilder has been replaced by the SpatialFilter (a Lucene Filter), some unused functionality has been removed, and the package hierarchy has changed.  Consequently, this patch breaks backwards compatibility with the existing spatial search contrib.

Also, in the course of making these changes, abstractions have been added so that the single implementation of the ThreadedDistanceFilter can work with both lat/long and geohash data formats, and so that precise but costly arc distance calculations can be replaced by less precise but much more efficient flat plane calculations if needed.
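
To illustrate the trade-off between the two kinds of calculation, here is a sketch comparing a great-circle (haversine) distance with one common flat plane approximation (an equirectangular projection).  The formulas used in the patch may differ.  The class name DistanceSketch, the method names and the mean Earth radius of 6371 km are assumptions.

```java
// Hypothetical illustration only; the patch may use different formulas.
class DistanceSketch {

    // Assumed mean Earth radius in kilometres.
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance via the haversine formula (several trig calls per pair). */
    static double arcDistanceKm(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);
        double sinHalfPhi = Math.sin(dPhi / 2);
        double sinHalfLambda = Math.sin(dLambda / 2);
        double a = sinHalfPhi * sinHalfPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfLambda * sinHalfLambda;
        return EARTH_RADIUS_KM * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    /** Flat plane approximation: one cosine and a square root, reasonable for short distances. */
    static double flatDistanceKm(double lat1, double lon1, double lat2, double lon2) {
        double meanPhi = Math.toRadians((lat1 + lat2) / 2);
        double x = Math.toRadians(lon2 - lon1) * Math.cos(meanPhi);
        double y = Math.toRadians(lat2 - lat1);
        return EARTH_RADIUS_KM * Math.sqrt(x * x + y * y);
    }
}
```

Over a few tens of kilometres the two results should agree closely; the approximation degrades over long distances and near the poles.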

This patch will be used in an upcoming patch for Solr which will improve Solr's support for spatial search.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1732) Multi-threaded Spatial Search

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726997#action_12726997 ] 

Uwe Schindler commented on LUCENE-1732:
---------------------------------------

It is only a couple of days until this is part of Solr trunk. This issue is about Lucene 2.9, so it should use Lucene 2.9 features. This contrib may become part of Solr some time in the future, but by then the Lucene JARs will also have been updated.

I only mention this because I will make the change to NumericUtils in the very near future (when I have time to do it). NumericUtils also has the important advantage that it is natively supported by Lucene's FieldCache (you can do e.g. FieldCache.getFloats() on such a field), so calculations on such fields could also be done using the FieldCache.

The Lucene people have no problem with changing either the API or the index format, as this has not yet been released, so there is no backwards-compatibility problem (only for people already using another version of LocalLucene).

Uwe



[jira] Commented: (LUCENE-1732) Multi-threaded Spatial Search

Posted by "David Smiley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838544#action_12838544 ] 

David Smiley commented on LUCENE-1732:
--------------------------------------

If I have a machine with, say, four CPU cores that is also running Solr with four cores (a distributed, i.e. sharded, index), would it be fair to say that the optimization presented in this issue is of no use?



[jira] Commented: (LUCENE-1732) Multi-threaded Spatial Search

Posted by "Chris Male (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726986#action_12726986 ] 

Chris Male commented on LUCENE-1732:
------------------------------------

Unfortunately, Solr is not yet using a version of Lucene that has the NumericUtils class, and I would like this patch to be usable with Solr.  When Solr does update to a version of Lucene that includes NumericUtils, I will update my patch to use the class.



[jira] Updated: (LUCENE-1732) Multi-threaded Spatial Search

Posted by "Chris Male (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Male updated LUCENE-1732:
-------------------------------

    Attachment: LUCENE-1732_multi_threaded_spatial_search.patch



[jira] Issue Comment Edited: (LUCENE-1732) Multi-threaded Spatial Search

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726965#action_12726965 ] 

Uwe Schindler edited comment on LUCENE-1732 at 7/3/09 6:44 AM:
---------------------------------------------------------------

As it breaks backwards compatibility, could you use o.a.l.util.NumericUtils instead of NumberUtils and remove the class? Using this new class, all numeric values use the same encoding coming from Lucene core. This is part of a new approach for numeric range queries (see also SOLR-940), but the number format can also be used for Spatial's Tier encoding.

I could then close LUCENE-1505.
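
To show why a shared numeric encoding is useful for range-style lookups, here is a sketch of one well-known way to map floats to ints whose integer ordering matches the float ordering.  It is written from scratch for illustration and is not copied from Lucene; the class SortableFloatSketch and its method names are hypothetical, and NumericUtils in Lucene core is what real code should use.

```java
// Hypothetical illustration only; use Lucene's NumericUtils in real code.
class SortableFloatSketch {

    /** Maps a float to an int so that comparing the ints orders the original floats. */
    static int encode(float value) {
        int bits = Float.floatToRawIntBits(value);
        // Negative floats have the sign bit set; flipping the remaining 31 bits
        // reverses their order so that more negative values map to smaller ints.
        if (bits < 0) {
            bits ^= 0x7fffffff;
        }
        return bits;
    }

    /** Inverse of encode: the same bit flip applied again restores the raw bits. */
    static float decode(int sortable) {
        if (sortable < 0) {
            sortable ^= 0x7fffffff;
        }
        return Float.intBitsToFloat(sortable);
    }
}
```

With an order-preserving encoding like this, a latitude or longitude range check can be done as a plain integer comparison on the encoded values.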

      was (Author: thetaphi):
    As it breaks backwards compatibility, could you use o.a.l.util.NumericUtils instead of NumberUtils and remove the class. Using this class, all numeric values use the same encoding coming from Lucene core. This is a new approach for numeric range queries, but the number format can also be used for Spatial's Tier encoding.

I could then close LUCENE-1505.
  


[jira] Commented: (LUCENE-1732) Multi-threaded Spatial Search

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726965#action_12726965 ] 

Uwe Schindler commented on LUCENE-1732:
---------------------------------------

As it breaks backwards compatibility, could you use o.a.l.util.NumericUtils instead of NumberUtils and remove the class? Using this class, all numeric values use the same encoding coming from Lucene core. This is a new approach for numeric range queries, but the number format can also be used for Spatial's Tier encoding.

I could then close LUCENE-1505.
