Posted to dev@nutch.apache.org by Max S <ma...@googlemail.com> on 2009/09/03 07:54:23 UTC
Customise scoring
Hi all,
I have installed XML and EXIF parser plugins into Nutch to
parse XML files and EXIF metadata from JPG images.
The idea would be to:
1. Fetch the XML file and extract data and links from it
NB: The XML file contains geo coordinates (latitude and longitude),
title and image links.
2. Fetch each image and extract its EXIF metadata
3. Store the extracted data from both parsers in the index.
I would like to customise search so that the results are ordered by the
following priority:
1. Proximity to location
2. Keywords from EXIF Metadata
3. Keywords from XML title
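
To make priority 1 concrete, here is a minimal Java sketch of one way to turn a distance into a proximity score. It assumes the haversine great-circle formula and a simple decay function. The class GeoProximity, its method names and the halfScoreKm parameter are hypothetical; they are not part of Nutch or the GeoPosition plugin, whose actual algorithm may differ.

```java
// Hypothetical helper: not part of Nutch or the GeoPosition plugin.
final class GeoProximity {
    // Mean Earth radius in kilometres (spherical approximation).
    private static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance in km between two points given in degrees (haversine formula). */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Guard against rounding pushing sqrt(a) slightly above 1.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /**
     * Maps a distance to a score in (0, 1]: 1.0 at zero distance,
     * 0.5 at halfScoreKm, approaching 0 as the distance grows.
     * The decay shape is an assumption chosen for illustration.
     */
    static double proximityScore(double distanceKm, double halfScoreKm) {
        return 1.0 / (1.0 + distanceKm / halfScoreKm);
    }
}
```

With something like this, a document whose coordinates are closer to the query location would receive a larger value from proximityScore, which could then feed into whatever ranking mechanism is used.
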
From what I can see at the moment, I will need to:
1. Assign a higher score to the fields according to the priority above
2. Repurpose the algorithm within GeoPosition plugin
(http://wiki.apache.org/nutch/GeoPosition)
3. Update the ScoringFilter logic to include the GeoPosition algorithm?
My question: is the last item correct, or is there another approach?
Where should I start looking? I would appreciate any suggestions.
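
As a rough illustration of item 1 in the plan above, the three signals could be combined as a weighted sum. This is only a sketch in plain Java: the class PriorityScore, the weights 4/2/1 and the assumption that each signal is normalised to [0, 1] are all hypothetical, and nothing here uses the Nutch ScoringFilter API.

```java
// Hypothetical sketch: the weights and names are assumptions, not Nutch API.
final class PriorityScore {
    static final double W_GEO = 4.0;    // priority 1: proximity to location
    static final double W_EXIF = 2.0;   // priority 2: keywords from EXIF metadata
    static final double W_TITLE = 1.0;  // priority 3: keywords from XML title

    /** Weighted sum of the three signals; each input is clamped to [0, 1]. */
    static double combine(double proximity, double exifMatch, double titleMatch) {
        return W_GEO * clamp(proximity)
             + W_EXIF * clamp(exifMatch)
             + W_TITLE * clamp(titleMatch);
    }

    private static double clamp(double v) {
        return Math.max(0.0, Math.min(1.0, v));
    }
}
```

Note that a weighted sum gives a soft priority rather than a strict ordering: a weak proximity match can still be outranked by strong keyword matches. A strict ordering would need a multi-key sort instead.
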
Regards
Max S