Posted to dev@nutch.apache.org by Stefan Dlugolinsky <s....@gmail.com> on 2009/03/03 05:08:38 UTC

Parsing, Indexing multiple values (of same type) per document - Nutch-0.9

Hello,

I'm writing a DistanceSearch plugin (similar to the GeoPosition
plugin) that parses HTML pages and extracts geographic data
(addresses, GPS coordinates, ...) from the text of the page. This
geographic data is geocoded into latitude and longitude values, which
are indexed and later used in search.

My problem arises when the parser finds more than one geographic
position (lat, lon) on a page. I want to return each geocoded position
to the indexer and index the document under every position found, so
that the document is linked to several geographic locations. (I don't
want to compute an average location or anything like that.)
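
To make the goal concrete, here is a minimal sketch in plain Java of the
data shape I am after: one document carrying several (lat, lon) pairs,
each kept as its own value rather than averaged. It deliberately uses no
Nutch or Lucene classes; Doc, indexPositions and the field names "lat"
and "lon" are hypothetical names made up for illustration, not part of
any existing API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultiValueDocSketch {

    // Hypothetical stand-in for an index document:
    // each field name maps to ALL values added under that name.
    public static final class Doc {
        private final Map<String, List<String>> fields = new LinkedHashMap<>();

        // Adding a value never overwrites earlier values of the same field.
        public void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        // Returns every value stored under the field, in insertion order.
        public List<String> values(String name) {
            return fields.getOrDefault(name, new ArrayList<>());
        }
    }

    // Builds one document and adds one "lat" and one "lon" value
    // per geocoded position found on the page (no averaging).
    public static Doc indexPositions(String url, double[][] positions) {
        Doc doc = new Doc();
        doc.add("url", url);
        for (double[] p : positions) {
            doc.add("lat", Double.toString(p[0]));
            doc.add("lon", Double.toString(p[1]));
        }
        return doc;
    }

    public static void main(String[] args) {
        // Made-up example coordinates for two positions found on one page.
        double[][] found = { {48.1486, 17.1077}, {50.0755, 14.4378} };
        Doc doc = indexPositions("http://example.com/page.html", found);
        System.out.println(doc.values("lat"));
        System.out.println(doc.values("lon"));
    }
}
```

In this sketch the i-th "lat" value and the i-th "lon" value belong
together only by position in the two lists. Whether the real index can
preserve that pairing at search time is part of what I am asking.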

Any suggestions will be appreciated.

Thank you.

S.D.