Posted to general@lucene.apache.org by Kelly Jones <ke...@gmail.com> on 2009/06/24 05:52:48 UTC

Using Lucene to index OSM nodes (400M latitude/longitude points)

Can Lucene index the openstreetmap.org (OSM) node db (400M
latitude/longitude pairs), and then find the 20 nodes closest to a
given latitude/longitude?

More specifically:

 % Can Lucene index numerical data and understand that 16 is close to
 15, but far away from 160000?

 % Is Lucene reasonably fast indexing 400M floating point pairs?

 % After Lucene creates the 400M index, can it return search results
 reasonably fast?

 % Is there a guide/tutorial that shows how to use Lucene to index
 numerical data (I'm using Plucene, but I'll settle for any sort of
 guide)?

I tried to index OSM data w/ SQLite3, but it took forever.

I realize I could use MySQL/PostgreSQL, but I'm looking for an
embedded/serverless solution.

-- 
We're just a Bunch Of Regular Guys, a collective group that's trying
to understand and assimilate technology. We feel that resistance to
new ideas and technology is unwise and ultimately futile.

Re: Using Lucene to index OSM nodes (400M latitude/longitude points)

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Kelly,

I think you want to look at LocalLucene (or LocalSolr).  I haven't played with Local*, so I can't provide more than this tip.  I can also suggest dropping Plucene: it's a dead project, and even when it was alive it was quite slow.  If you really need to search from a Perl application, your best bet may be a Perl Solr client.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch





Re: Using Lucene to index OSM nodes (400M latitude/longitude points)

Posted by Ted Dunning <te...@gmail.com>.
On Tue, Jun 23, 2009 at 8:52 PM, Kelly Jones <ke...@gmail.com> wrote:

> Can Lucene index the openstreetmap.org (OSM) node db (400M
> latitude/longitude pairs), and then find the 20 nodes closest to a
> given latitude/longitude?
>

Probably.  (isn't that definitive!)

> % Can Lucene index numerical data and understand that 16 is close to
>  15, but far away from 160000?
>

As of the recent versions, yes.
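
To illustrate why numbers need special handling in a term-based index, here is a minimal sketch in plain Python. It is not Lucene's API. It assumes terms are compared as strings, and the helper names (double_to_sortable_int, sortable_term) are made up for this example. The idea is to map each floating point value to a fixed-width key whose string order matches numeric order, so that a range of values becomes a contiguous range of terms.

```python
import struct

def double_to_sortable_int(value: float) -> int:
    """Map a float to an unsigned 64-bit int that sorts in numeric order.

    Hypothetical helper for illustration; not a Lucene function.
    """
    # Reinterpret the IEEE 754 double as an unsigned 64-bit integer.
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    if bits & (1 << 63):
        # Negative values: flip every bit so larger magnitudes sort lower.
        return bits ^ 0xFFFFFFFFFFFFFFFF
    # Non-negative values: set the top bit so they sort above negatives.
    return bits | (1 << 63)

def sortable_term(value: float) -> str:
    """Fixed-width hex key; string order equals numeric order."""
    return format(double_to_sortable_int(value), "016x")

# Plain decimal strings sort lexicographically, not numerically:
naive = sorted(["16", "15", "160000", "9"])  # ['15', '16', '160000', '9']

# The encoded keys sort in numeric order, including negative longitudes:
values = [160000.0, -122.4, 16.0, 0.0, 15.0, -0.5]
by_term = sorted(values, key=sortable_term)
```

With keys like these, "all values between 15 and 16" is a contiguous slice of the sorted term list, which is the property a numeric range search relies on.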


> % Is Lucene reasonably fast indexing 400M floating point pairs?
>

Should be.  I have no experience with this kind of indexing.  You will need
a good amount of memory, or a sharding system such as Katta.


>
> % After Lucene creates the 400M index, can it return search results
>  reasonably fast?
>

With the right level of parallelism, absolutely.
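
As a rough sketch of what parallelism over shards means for a "20 closest nodes" query: each shard returns its own best k hits, and the partial lists are merged into a global top k. The code below is plain Python, not Lucene or Katta. The function names, the (node_id, lat, lon) tuple layout, and the brute-force scan inside each shard are assumptions for illustration. A real shard would use its index instead of scanning, and the shards would be queried concurrently rather than in a loop.

```python
import heapq
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def shard_top_k(shard, lat, lon, k):
    """Return the k nearest (distance_km, node_id) pairs within one shard."""
    scored = ((haversine_km(lat, lon, nlat, nlon), node_id)
              for node_id, nlat, nlon in shard)
    return heapq.nsmallest(k, scored)

def search(shards, lat, lon, k=20):
    """Scatter the query to every shard, then merge the partial results."""
    # Sequential here for simplicity; each call is independent of the others.
    partials = [shard_top_k(shard, lat, lon, k) for shard in shards]
    return heapq.nsmallest(k, (hit for part in partials for hit in part))
```

Because every shard returns at most k candidates, the merge step handles at most k times the number of shards, regardless of how many nodes each shard holds.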


> % Is there a guide/tutorial that shows how to use Lucene to index
>  numerical data (I'm using Plucene, but I'll settle for any sort of
>  guide)?
>

Not really.  Efficient numerical search is relatively new in Lucene:

See the slide shows on Michael Busch's LinkedIn profile:
http://www.linkedin.com/profile?viewProfile=&key=12809985&authToken=-hrn&authType=name

Also, see here:
http://wiki.apache.org/lucene-java/SearchNumericalFields



-- 
Ted Dunning, CTO
DeepDyve

111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
http://www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)