Posted to general@lucene.apache.org by Kelly Jones <ke...@gmail.com> on 2009/06/24 05:52:48 UTC
Using Lucene to index OSM nodes (400M latitude/longitude points)
Can Lucene index the openstreetmap.org (OSM) node db (400M
latitude/longitude pairs), and then find the 20 nodes closest to a
given latitude/longitude?
More specifically:
% Can Lucene index numerical data and understand that 16 is close to
15, but far away from 160000?
% Is Lucene reasonably fast indexing 400M floating point pairs?
% After Lucene creates the 400M index, can it return search results
reasonably fast?
% Is there a guide/tutorial that shows how to use Lucene to index
numerical data (I'm using Plucene, but I'll settle for any sort of
guide)?
I tried to index OSM data w/ SQLite3, but it took forever.
I realize I could use MySQL/PostgreSQL, but I'm looking for an
embedded/serverless solution.
--
We're just a Bunch Of Regular Guys, a collective group that's trying
to understand and assimilate technology. We feel that resistance to
new ideas and technology is unwise and ultimately futile.
Re: Using Lucene to index OSM nodes (400M latitude/longitude points)
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Kelly,
I think you want to look at LocalLucene (or LocalSolr). I haven't played with Local*, so I can't provide more than this tip. I'd also suggest dropping Plucene: it's a dead project, and even when it was alive it was quite slow. If you really need to search from a Perl application, your best bet may be a Perl Solr client.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Re: Using Lucene to index OSM nodes (400M latitude/longitude points)
Posted by Ted Dunning <te...@gmail.com>.
On Tue, Jun 23, 2009 at 8:52 PM, Kelly Jones <ke...@gmail.com> wrote:
> Can Lucene index the openstreetmap.org (OSM) node db (400M
> latitude/longitude pairs), and then find the 20 nodes closest to a
> given latitude/longitude?
>
Probably. (isn't that definitive!)
> % Can Lucene index numerical data and understand that 16 is close to
> 15, but far away from 160000?
>
As of recent versions, yes.
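To illustrate why this question comes up at all: an index that treats numbers as plain text orders them character by character, so "160000" lands between "16" and "2". The sketch below is not Lucene's implementation; it only shows the ordering problem and one simple workaround (fixed-width zero padding) under the assumption of non-negative integers. The helper names `encode_fixed_width` and `terms_in_range` are made up for this example.

```python
import bisect

def encode_fixed_width(value, width=10):
    """Zero-pad a non-negative integer so string order equals numeric order."""
    if value < 0 or len(str(value)) > width:
        raise ValueError("value out of range for this sketch")
    return str(value).zfill(width)

def terms_in_range(sorted_terms, low, high, width=10):
    """Return the padded terms whose numeric value lies in [low, high]."""
    start = bisect.bisect_left(sorted_terms, encode_fixed_width(low, width))
    stop = bisect.bisect_right(sorted_terms, encode_fixed_width(high, width))
    return sorted_terms[start:stop]

values = [16, 15, 160000, 2]

# Raw decimal strings sort character by character, not by magnitude.
as_text = sorted(str(v) for v in values)

# Fixed-width strings sort in the same order as the numbers themselves.
as_padded = sorted(encode_fixed_width(v) for v in values)

# A range lookup over the padded terms finds 15 and 16 but not 160000.
near_16 = terms_in_range(as_padded, 10, 20)
```

Latitude/longitude values would first need to be mapped to non-negative integers (for example by shifting and scaling to a fixed precision) before an encoding like this applies; that mapping is left out here.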
> % Is Lucene reasonably fast indexing 400M floating point pairs?
>
Should be. I have no experience with this kind of indexing. You will need
either a good amount of memory or a sharding system like Katta.
>
> % After Lucene creates the 400M index, can it return search results
> reasonably fast?
>
With the right level of parallelism, absolutely.
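As a rough picture of what parallelism over shards means for a "20 closest nodes" query: each shard returns its own k best candidates, and the partial results are merged into a global top k. The sketch below is only that pattern in plain Python. It does not use Lucene or Katta, it scans every point in each shard rather than using an index, and the function names are hypothetical. Python threads will not speed up this CPU-bound loop; they are here only to show the scatter-gather structure.

```python
import heapq
import math
from concurrent.futures import ThreadPoolExecutor

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearest_in_shard(shard, lat, lon, k):
    """Return up to k (distance, point) pairs from one shard, nearest first."""
    scored = ((haversine_km(lat, lon, p[0], p[1]), p) for p in shard)
    return heapq.nsmallest(k, scored)

def nearest_across_shards(shards, lat, lon, k=20):
    """Query every shard, then merge the partial results into a global top k."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(
            lambda shard: nearest_in_shard(shard, lat, lon, k), shards))
    merged = heapq.nsmallest(k, (hit for part in partials for hit in part))
    return [point for _, point in merged]

# Two tiny shards of (latitude, longitude) pairs, made up for illustration.
shards = [
    [(0.0, 0.0), (10.0, 10.0)],
    [(1.0, 1.0), (50.0, 50.0)],
]
closest = nearest_across_shards(shards, 0.0, 0.0, k=2)
```

Because every shard returns its own k best hits, the merged result is the same as a single scan over all points; the cost is that each query touches every shard.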
> % Is there a guide/tutorial that shows how to use Lucene to index
> numerical data (I'm using Plucene, but I'll settle for any sort of
> guide)?
>
Not really. Efficient numerical search is relatively new in Lucene.
See the slide shows on Michael Busch's LinkedIn profile:
http://www.linkedin.com/profile?viewProfile=&key=12809985&authToken=-hrn&authType=name
Also, see here:
http://wiki.apache.org/lucene-java/SearchNumericalFields
--
Ted Dunning, CTO
DeepDyve
111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
http://www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)