Posted to dev@nutch.apache.org by Doug Cutting <cu...@nutch.org> on 2006/01/25 20:41:14 UTC
Re: Ideas for enhancements
Howie Wang wrote:
> 1. A String[] HitDetails.getValues(String field) method that
> returns an array of the values. The current method only returns a
> single string, and Lucene indexes can have multiple values
> per field.
That sounds useful. Please submit a patch against the trunk attached to
a bug report.
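
To make the request concrete, here is a minimal sketch of what such an
accessor could look like. It assumes the details are held as parallel
arrays of field names and values. HitDetailsSketch and its layout are
stand-ins for illustration, not the actual HitDetails class, and
returning an empty array for a missing field is a choice made only for
this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a hit-details holder; NOT the actual Nutch class.
class HitDetailsSketch {
    private final String[] fields; // field names, parallel to values
    private final String[] values; // a field name may appear more than once

    HitDetailsSketch(String[] fields, String[] values) {
        if (fields.length != values.length) {
            throw new IllegalArgumentException("fields and values must have equal length");
        }
        this.fields = fields.clone();
        this.values = values.clone();
    }

    // Single-value accessor: the first value stored for the field, or null if absent.
    String getValue(String field) {
        for (int i = 0; i < fields.length; i++) {
            if (fields[i].equals(field)) {
                return values[i];
            }
        }
        return null;
    }

    // Proposed accessor: every value stored for the field, in stored order.
    // Returns an empty array when the field is absent (an assumption of this sketch).
    String[] getValues(String field) {
        List<String> matches = new ArrayList<>();
        for (int i = 0; i < fields.length; i++) {
            if (fields[i].equals(field)) {
                matches.add(values[i]);
            }
        }
        return matches.toArray(new String[0]);
    }
}
```

With fields {"url", "tag", "tag"}, getValue("tag") would give only the
first tag, while getValues("tag") would give both.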
> 2. In Link.java, put in a field (parentURL) for the URL of the page that
> contains the link. Right now it seems we just have the links themselves
> and we can't backtrack where they come from. Being able to backtrack
> through the links is handy for doing something like categorization. For
> example, you see that all the links are coming from a page about poodles,
> so you might categorize the linked page as a poodle page. It might also
> come in handy for doing something like a Google TrustRank scoring, where
> you penalize certain sites if they're a known link farm, or boost them
> if they're from some place respected like DMOZ.
This would certainly be useful functionality. The link db has changed
substantially in the current trunk and there is no longer a class named
Link. This has been replaced with Inlink and Outlink. Have a look at
the trunk and see if what you need isn't already there.
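
As an illustration of the categorization idea quoted above, here is a
rough sketch that votes on a page's category from the pages linking to
it. Everything in it is hypothetical: InlinkSketch, its fromUrl and
anchor fields, and the categoryBySourceUrl map are invented for the
example and are not taken from the trunk's Inlink class.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an incoming link: the linking page's URL plus anchor text.
class InlinkSketch {
    final String fromUrl;
    final String anchor;

    InlinkSketch(String fromUrl, String anchor) {
        this.fromUrl = fromUrl;
        this.anchor = anchor;
    }
}

class InlinkCategorizer {
    // Returns the most common category among the linking pages, or null when
    // none of the linking pages has a known category. On a tie, the category
    // that reached the highest count first wins.
    static String categorize(List<InlinkSketch> inlinks,
                             Map<String, String> categoryBySourceUrl) {
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (InlinkSketch link : inlinks) {
            String category = categoryBySourceUrl.get(link.fromUrl);
            if (category == null) {
                continue; // the linking page has no known category
            }
            int count = votes.merge(category, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = category;
            }
        }
        return best;
    }
}
```

A trust-style score could presumably be built the same way, by summing
a per-source weight instead of counting votes.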
> 3. Get sorting to work on multiple fields. Lucene already works on
> multiple fields so it shouldn't be difficult to get this working. Just
> change the places where it passes down String field so that it
> accepts an array. The sort fields could be read from the query
> string in order:
>
> search.jsp?sort=score&reverse=true&sort=date&reverse=false
This would also be useful. Please submit a patch against the trunk.
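
For reference, a minimal sketch of how the repeated sort and reverse
parameters from the example URL might be paired up in order. SortSpec
and SortParams are hypothetical names. The sketch assumes the two
arrays come from something like request.getParameterValues("sort") and
request.getParameterValues("reverse"), and that the values arrive in
query-string order. A sort without a matching reverse is treated as
not reversed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for one sort criterion.
class SortSpec {
    final String field;
    final boolean reverse;

    SortSpec(String field, boolean reverse) {
        this.field = field;
        this.reverse = reverse;
    }

    @Override
    public String toString() {
        return field + ":" + reverse;
    }
}

class SortParams {
    // Pairs sorts[i] with reverses[i]. Either array may be null, which is
    // what getParameterValues returns when the parameter is absent.
    static List<SortSpec> parse(String[] sorts, String[] reverses) {
        List<SortSpec> specs = new ArrayList<>();
        if (sorts == null) {
            return specs;
        }
        for (int i = 0; i < sorts.length; i++) {
            boolean reverse = reverses != null
                    && i < reverses.length
                    && Boolean.parseBoolean(reverses[i]);
            specs.add(new SortSpec(sorts[i], reverse));
        }
        return specs;
    }
}
```

Each SortSpec could then be turned into a Lucene SortField and the
whole list passed to a Sort, instead of passing down a single String
field.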
Thanks!
Doug
Re: Ideas for enhancements
Posted by Stefan Groschupf <sg...@media-style.com>.
Hi Howie,
> Howie Wang wrote:
>> 1. A String[] HitDetails.getValues(String field) method that
>> returns an array of the values. The current method only returns a
>> single string, and Lucene indexes can have multiple values
>> per field.
>
> That sounds useful. Please submit a patch against the trunk
> attached to a bug report.
Has any work already been done on this? I would love to have multiple
values per field, and if nothing has been done yet I would be glad to
create such a patch.
Thanks.
Stefan