Posted to java-user@lucene.apache.org by Guillermo Payet <gp...@localharvest.org> on 2010/03/28 00:15:48 UTC

Lucene Spatial

Hello,

We've been using locallucene for years and years in our search engine of
family farms: http://www.localharvest.org/

We'd like to upgrade Lucene to 3.0.1, which also means migrating from
locallucene to lucene-spatial. However, Lucene spatial seems very different
from locallucene, and I can't even find any documentation on how to use 
it!  Can someone tell me if lucene-spatial is production ready?  Maybe point
me to some docs?  Is there any sample code for a basic lucene search using it?

This is the best I've found so far:

http://www.mail-archive.com/java-dev@lucene.apache.org/msg36727.html

Thanks

    --G



-- 
Guillermo Payet
L O C A L H A R V E S T
http://www.localharvest.org
http://twitter.com/localharvestorg

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Lucene Spatial

Posted by Guillermo Payet <gp...@localharvest.org>.
Hello Grant,

> Is there a specific thing you are having a problem with? 

In LocalLucene, we use DistanceSortSource for sorting, which is no more.
How do we do distance sorting in Spatial?  

Also, we use BoundaryBoxFilter to show all points visible in a zoomable 
map.  This class is no longer there either.  What replaces it?
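
Independent of either library, the two operations in question can be sketched in plain Java. This only illustrates the technique (a bounding-box test followed by a great-circle distance sort); it is not the Lucene Spatial API. The names GeoSketch, Point, inBox and visibleSortedByDistance are made up for the example, the Earth radius is an approximate mean value, and the box test assumes the map view does not cross the 180th meridian.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustration only: plain Java, no Lucene or LocalLucene classes.
final class GeoSketch {
    // Approximate mean Earth radius in miles (assumption for this sketch).
    static final double EARTH_RADIUS_MILES = 3958.8;

    /** A named location in decimal degrees. */
    static final class Point {
        final String name;
        final double lat;
        final double lon;

        Point(String name, double lat, double lon) {
            this.name = name;
            this.lat = lat;
            this.lon = lon;
        }
    }

    /** Great-circle distance between two lat/lon pairs (haversine formula). */
    static double distanceMiles(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a));
    }

    /** True if the point lies inside the visible map rectangle (no antimeridian handling). */
    static boolean inBox(Point p, double minLat, double minLon, double maxLat, double maxLon) {
        return p.lat >= minLat && p.lat <= maxLat && p.lon >= minLon && p.lon <= maxLon;
    }

    /** Keeps the points inside the box, nearest to the given center first. */
    static List<Point> visibleSortedByDistance(List<Point> points,
                                               double minLat, double minLon,
                                               double maxLat, double maxLon,
                                               double centerLat, double centerLon) {
        List<Point> hits = new ArrayList<Point>();
        for (Point p : points) {
            if (inBox(p, minLat, minLon, maxLat, maxLon)) {
                hits.add(p);
            }
        }
        hits.sort(Comparator.comparingDouble(
                (Point p) -> distanceMiles(centerLat, centerLon, p.lat, p.lon)));
        return hits;
    }
}
```

In a real index the box test would be a range filter on indexed latitude/longitude fields rather than a loop over all points; the loop is only there to keep the sketch self-contained.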

Thanks.

    --G




On Tue, Mar 30, 2010 at 03:48:13PM -0400, Grant Ingersoll wrote:
> Hi Guillermo,
> 
> I think you will find that Lucene Spatial is undergoing a significant rewrite in the coming weeks, so I'm hesitant to recommend that you upgrade to it at this point in time.  That being said, the concepts behind Lucene Spatial and LocalLucene aren't all that different, so it should be fairly easy to upgrade.  Is there a specific thing you are having a problem with?  That might be easier to get a handle on than the general question of how to do this.
> 
> -Grant
> 
> On Mar 30, 2010, at 11:55 AM, Guillermo Payet wrote:
> 
> > I had seen those, but they don't quite help on our problem of migrating from 
> > LocalLucene 2.0 to Lucene Spatial.  We don't use Solr.
> > 
> > Thanks though.
> > 
> > Anybody using Lucene Spatial, without Solr, wanting to share some code for
> > a basic geographical search?
> > 
> >    --G
> > 
> > 
> > On Tue, Mar 30, 2010 at 03:05:43PM +0200, Isabel Drost wrote:
> >> On Sat Guillermo Payet <gp...@localharvest.org> wrote:
> >>> Maybe point me to some docs?  Is there any sample
> >>> code for a basic lucene search using it?
> >> 
> >> A quick Google search for lucene spatial revealed the following
> >> articles:
> >> 
> >> http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html
> >> http://blog.jteam.nl/2009/08/03/geo-location-search-with-solr-and-lucene/
> >> 
> >> Hope that helps.
> >> 
> >> 
> >> There is also a video of Chris Male explaining Lucene Spatial from a
> >> meetup some time ago in Berlin:
> >> 
> >> http://vimeo.com/10204365
> >> 
> >> 
> >> Cheers,
> >> Isabel
> 
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
> 
> Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search

-- 
Guillermo Payet
L O C A L H A R V E S T
http://www.localharvest.org
http://twitter.com/localharvestorg



Re: Lucene Spatial

Posted by Grant Ingersoll <gs...@apache.org>.
Hi Guillermo,

I think you will find that Lucene Spatial is undergoing a significant rewrite in the coming weeks, so I'm hesitant to recommend that you upgrade to it at this point in time.  That being said, the concepts behind Lucene Spatial and LocalLucene aren't all that different, so it should be fairly easy to upgrade.  Is there a specific thing you are having a problem with?  That might be easier to get a handle on than the general question of how to do this.

-Grant

On Mar 30, 2010, at 11:55 AM, Guillermo Payet wrote:

> I had seen those, but they don't quite help on our problem of migrating from 
> LocalLucene 2.0 to Lucene Spatial.  We don't use Solr.
> 
> Thanks though.
> 
> Anybody using Lucene Spatial, without Solr, wanting to share some code for
> a basic geographical search?
> 
>    --G
> 
> 
> On Tue, Mar 30, 2010 at 03:05:43PM +0200, Isabel Drost wrote:
>> On Sat Guillermo Payet <gp...@localharvest.org> wrote:
>>> Maybe point me to some docs?  Is there any sample
>>> code for a basic lucene search using it?
>> 
>> A quick Google search for lucene spatial revealed the following
>> articles:
>> 
>> http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html
>> http://blog.jteam.nl/2009/08/03/geo-location-search-with-solr-and-lucene/
>> 
>> Hope that helps.
>> 
>> 
>> There is also a video of Chris Male explaining Lucene Spatial from a
>> meetup some time ago in Berlin:
>> 
>> http://vimeo.com/10204365
>> 
>> 
>> Cheers,
>> Isabel

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search




Re: Lucene Spatial

Posted by Guillermo Payet <gp...@localharvest.org>.
I had seen those, but they don't quite help on our problem of migrating from 
LocalLucene 2.0 to Lucene Spatial.  We don't use Solr.

Thanks though.

Anybody using Lucene Spatial, without Solr, wanting to share some code for
a basic geographical search?

    --G


On Tue, Mar 30, 2010 at 03:05:43PM +0200, Isabel Drost wrote:
> On Sat Guillermo Payet <gp...@localharvest.org> wrote:
> > Maybe point me to some docs?  Is there any sample
> > code for a basic lucene search using it?
> 
> A quick Google search for lucene spatial revealed the following
> articles:
> 
> http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html
> http://blog.jteam.nl/2009/08/03/geo-location-search-with-solr-and-lucene/
> 
> Hope that helps.
> 
> 
> There is also a video of Chris Male explaining Lucene Spatial from a
> meetup some time ago in Berlin:
> 
> http://vimeo.com/10204365
> 
> 
> Cheers,
> Isabel

-- 
Guillermo Payet
L O C A L H A R V E S T
http://www.localharvest.org
http://twitter.com/localharvestorg



Re: Lucene Partition Size

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Thu, Apr 8, 2010 at 2:44 PM, Karl Wettin <ka...@gmail.com> wrote:
>
> 8 apr 2010 kl. 20.05 skrev Ivan Provalov:
>
>> We are using Lucene for searching of 200+ mln documents (periodical
>> publications).  Is there any limitation on the size of the Lucene index
>> (file size, number of docs, etc...)?
>
> The only such limitation in Lucene I'm aware of is Integer.MAX_VALUE
> documents. This might also be true for number of terms.

Actually, the max number of terms is 128 * MAX_INT (i.e. ~256 B).

That 128 is the terms index interval, so if you e.g. set the index
divisor to 2, it'll double the max number of terms.
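
As a quick check of that arithmetic (plain Java, nothing from Lucene; the class and method names are made up for illustration): 128 * Integer.MAX_VALUE is 274,877,906,816, just under 2^38, which is where the "256" comes from if you count in binary giga-units. The product has to be computed as a long, since it overflows an int.

```java
// Worked arithmetic only; no Lucene classes are used.
final class TermLimit {
    /** The terms index interval of 128 quoted above. */
    static final int DEFAULT_INTERVAL = 128;

    /**
     * Upper bound on the number of terms, taken as
     * interval * divisor * Integer.MAX_VALUE and computed in long
     * arithmetic to avoid int overflow.
     */
    static long maxTerms(int interval, int divisor) {
        return (long) interval * divisor * Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(maxTerms(DEFAULT_INTERVAL, 1)); // 274877906816
        System.out.println(maxTerms(DEFAULT_INTERVAL, 2)); // 549755813632
    }
}
```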

Mike



Re: Lucene Partition Size

Posted by Ivan Provalov <ip...@yahoo.com>.
Thank you, Karl!

--- On Fri, 4/9/10, Karl Wettin <ka...@gmail.com> wrote:

> From: Karl Wettin <ka...@gmail.com>
> Subject: Re: Lucene Partition Size
> To: java-user@lucene.apache.org
> Date: Friday, April 9, 2010, 9:39 AM
> It's hard for me to say why this is slow.
> 
> Here are a few more questions whose answers might provide further clues:
> 
> What were the reasons that led you to partition the index this way?
> What does the searcher implementation look like?
> What would a typical query sent to that searcher look like?
> 
> I would start by loading the full 400Gb as a single index on local disc.
> 
> 
>     karl
> 
> 8 apr 2010 kl. 22.07 skrev Ivan Provalov:
> 
> > Karl,
> >
> > We have not done the same scale local-disk test.  Our network
> > parameters are
> >
> > -  Network speed:  1gb
> > -  3 partitions per volume
> > -  The volumes are accessed via NFS to EMC Celera devices. (NFS 3)
> > -  The drives are 300 gb fiber attached with 10,000 rpm.
> >
> > Thanks,
> >
> > Ivan
> >
> > --- On Thu, 4/8/10, Karl Wettin <ka...@gmail.com> wrote:
> >
> >> From: Karl Wettin <ka...@gmail.com>
> >> Subject: Re: Lucene Partition Size
> >> To: java-user@lucene.apache.org
> >> Date: Thursday, April 8, 2010, 2:44 PM
> >>
> >> 8 apr 2010 kl. 20.05 skrev Ivan Provalov:
> >>
> >>> We are using Lucene for searching of 200+ mln documents (periodical
> >>> publications).  Is there any limitation on the size of the Lucene
> >>> index (file size, number of docs, etc...)?
> >>
> >> The only such limitation in Lucene I'm aware of is Integer.MAX_VALUE
> >> documents. This might also be true for number of terms.
> >>
> >>> We are partitioning the indexes at about 10 mln documents per
> >>> partition (each partition is on a separate box, some mounted volumes
> >>> are shared among three-four partitions).  We have index size around
> >>> 10% of the content size.  The total content is around 4Tb and the
> >>> index around 400Gb.  This content is planned to be split into 20
> >>> partitions (10mln docs, 200Gb content size, 20Gb index size).  We
> >>> are using a memory mapped index directory implementation.  Our
> >>> testing is done with 600 concurrent users.
> >>>
> >>> We are seeing consistently high response times from the partitions
> >>> (4-5 seconds).  Is there a number of documents per partition
> >>> limitation in Lucene for this particular scenario?
> >>
> >> I'm not sure if I got this right but it sounds like your index is
> >> mounted over network? Can you tell us some more details about that?
> >> What speeds do you see if you put the index on local disc?
> >>
> >>
> >>
> >>     karl


Re: Lucene Partition Size

Posted by Karl Wettin <ka...@gmail.com>.
It's hard for me to say why this is slow.

Here are a few more questions whose answers might provide further clues:

What were the reasons that led you to partition the index this way?
What does the searcher implementation look like?
What would a typical query sent to that searcher look like?

I would start by loading the full 400Gb as a single index on local disc.


	karl

8 apr 2010 kl. 22.07 skrev Ivan Provalov:

> Karl,
>
> We have not done the same scale local-disk test.  Our network  
> parameters are
>
> -  Network speed:  1gb
> -  3 partitions per volume
> -  The volumes are accessed via NFS to EMC Celera devices. (NFS 3)
> -  The drives are 300 gb fiber attached with 10,000 rpm.
>
> Thanks,
>
> Ivan
>
> --- On Thu, 4/8/10, Karl Wettin <ka...@gmail.com> wrote:
>
>> From: Karl Wettin <ka...@gmail.com>
>> Subject: Re: Lucene Partition Size
>> To: java-user@lucene.apache.org
>> Date: Thursday, April 8, 2010, 2:44 PM
>>
>> 8 apr 2010 kl. 20.05 skrev Ivan Provalov:
>>
>>> We are using Lucene for searching of 200+ mln documents (periodical
>>> publications).  Is there any limitation on the size of the Lucene
>>> index (file size, number of docs, etc...)?
>>
>> The only such limitation in Lucene I'm aware of is Integer.MAX_VALUE
>> documents. This might also be true for number of terms.
>>
>>> We are partitioning the indexes at about 10 mln documents per
>>> partition (each partition is on a separate box, some mounted volumes
>>> are shared among three-four partitions).  We have index size around
>>> 10% of the content size.  The total content is around 4Tb and the
>>> index around 400Gb.  This content is planned to be split into 20
>>> partitions (10mln docs, 200Gb content size, 20Gb index size).  We
>>> are using a memory mapped index directory implementation.  Our
>>> testing is done with 600 concurrent users.
>>>
>>> We are seeing consistently high response times from the partitions
>>> (4-5 seconds).  Is there a number of documents per partition
>>> limitation in Lucene for this particular scenario?
>>
>> I'm not sure if I got this right but it sounds like your index is
>> mounted over network? Can you tell us some more details about that?
>> What speeds do you see if you put the index on local disc?
>>
>>
>>
>>     karl


Re: Lucene Partition Size

Posted by Ivan Provalov <ip...@yahoo.com>.
Karl,

We have not run a local-disk test at the same scale.  Our network parameters are:

-  Network speed:  1 Gb/s
-  3 partitions per volume
-  The volumes are accessed via NFS (v3) on EMC Celerra devices.
-  The drives are 300 GB, fiber-attached, 10,000 rpm.

Thanks,

Ivan

--- On Thu, 4/8/10, Karl Wettin <ka...@gmail.com> wrote:

> From: Karl Wettin <ka...@gmail.com>
> Subject: Re: Lucene Partition Size
> To: java-user@lucene.apache.org
> Date: Thursday, April 8, 2010, 2:44 PM
> 
> 8 apr 2010 kl. 20.05 skrev Ivan Provalov:
> 
> > We are using Lucene for searching of 200+ mln documents (periodical
> > publications).  Is there any limitation on the size of the Lucene
> > index (file size, number of docs, etc...)?
> 
> The only such limitation in Lucene I'm aware of is Integer.MAX_VALUE
> documents. This might also be true for number of terms.
> 
> > We are partitioning the indexes at about 10 mln documents per
> > partition (each partition is on a separate box, some mounted volumes
> > are shared among three-four partitions).  We have index size around
> > 10% of the content size.  The total content is around 4Tb and the
> > index around 400Gb.  This content is planned to be split into 20
> > partitions (10mln docs, 200Gb content size, 20Gb index size).  We
> > are using a memory mapped index directory implementation.  Our
> > testing is done with 600 concurrent users.
> > 
> > We are seeing consistently high response times from the partitions
> > (4-5 seconds).  Is there a number of documents per partition
> > limitation in Lucene for this particular scenario?
> 
> I'm not sure if I got this right but it sounds like your index is
> mounted over network? Can you tell us some more details about that?
> What speeds do you see if you put the index on local disc?
> 
> 
> 
>     karl


Re: Lucene Partition Size

Posted by Karl Wettin <ka...@gmail.com>.
8 apr 2010 kl. 20.05 skrev Ivan Provalov:

> We are using Lucene for searching of 200+ mln documents (periodical  
> publications).  Is there any limitation on the size of the Lucene  
> index (file size, number of docs, etc...)?

The only such limitation in Lucene I'm aware of is Integer.MAX_VALUE  
documents. This might also be true for number of terms.

> We are partitioning the indexes at about 10 mln documents per  
> partition (each partition is on a separate box, some mounted volumes  
> are shared among three-four partitions).  We have index size around  
> 10% of the content size.  The total content is around 4Tb and the  
> index around 400Gb.  This content is planned to be split into 20  
> partitions (10mln docs, 200Gb content size, 20Gb index size).  We  
> are using a memory mapped index directory implementation.  Our  
> testing is done with 600 concurrent users.
>
> We are seeing consistently high response times from the partitions  
> (4-5 seconds).  Is there a number of documents per partition  
> limitation in Lucene for this particular scenario?

I'm not sure if I got this right but it sounds like your index is  
mounted over network? Can you tell us some more details about that?  
What speeds do you see if you put the index on local disc?



	karl



Lucene Partition Size

Posted by Ivan Provalov <ip...@yahoo.com>.
We are using Lucene to search 200+ million documents (periodical publications).  Is there any limitation on the size of a Lucene index (file size, number of docs, etc.)?

We are partitioning the indexes at about 10 million documents per partition (each partition is on a separate box; some mounted volumes are shared among three or four partitions).  The index size is around 10% of the content size.  The total content is around 4 TB and the index around 400 GB.  This content is planned to be split into 20 partitions (10 million docs, 200 GB content size, 20 GB index size each).  We are using a memory-mapped index directory implementation.  Our testing is done with 600 concurrent users.

We are seeing consistently high response times from the partitions (4-5 seconds).  Is there a limit on the number of documents per partition in Lucene for this particular scenario?
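
The sizing above is simple division. A throwaway sketch (plain Java, no Lucene; the class and method names are invented here, and decimal units are assumed so that 4 TB = 4000 GB) reproduces the per-partition figures:

```java
// Back-of-the-envelope check of the partition plan; no Lucene involved.
final class PartitionPlan {
    /** Number of partitions needed, rounding up so no documents are left over. */
    static long partitionsNeeded(long totalDocs, long docsPerPartition) {
        return (totalDocs + docsPerPartition - 1) / docsPerPartition;
    }

    /** Even share of a total size (in GB) across the partitions. */
    static double sharePerPartitionGb(double totalGb, long partitions) {
        return totalGb / partitions;
    }

    public static void main(String[] args) {
        long partitions = partitionsNeeded(200_000_000L, 10_000_000L);
        System.out.println("partitions: " + partitions);
        System.out.println("content GB per partition: "
                + sharePerPartitionGb(4000.0, partitions));
        System.out.println("index GB per partition: "
                + sharePerPartitionGb(400.0, partitions));
    }
}
```

With exactly 200 million documents this gives 20 partitions, each with 200 GB of content and 20 GB of index; anything over 200 million documents rounds up to a 21st partition.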

Thanks,

Ivan


      



Re: Lucene Spatial

Posted by Isabel Drost <is...@apache.org>.
On Sat Guillermo Payet <gp...@localharvest.org> wrote:
> Maybe point me to some docs?  Is there any sample
> code for a basic lucene search using it?

A quick Google search for lucene spatial revealed the following
articles:

http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html
http://blog.jteam.nl/2009/08/03/geo-location-search-with-solr-and-lucene/

Hope that helps.


There is also a video of Chris Male explaining Lucene Spatial from a
meetup some time ago in Berlin:

http://vimeo.com/10204365


Cheers,
Isabel

