Posted to user@nutch.apache.org by Hangthunder <ji...@gmail.com> on 2012/03/26 06:54:51 UTC

Re: [ANNOUNCEMENT] Lewis John Mc Gibbney is a Nutch committer and PMC member

Hi Lewis,

I have a question for you.

I used Nutch 1.4 to crawl a web site and then indexed the results into
Solr 3.5. This was successful. I checked the index using Luke, and it shows
that 1678 documents were fetched. But when I entered a query string (one or
two words) in the Solr interface, all 1678 web pages (documents) were
retrieved, even though most of them do not contain the query words at all.
What is the problem? Could you give me some idea?

My email: jiajin.lei@gmail.com

Thank you very much!

Jiajin

--
View this message in context: http://lucene.472066.n3.nabble.com/ANNOUNCEMENT-Lewis-John-Mc-Gibbney-is-a-Nutch-committer-and-PMC-member-tp3120688p3857334.html
Sent from the Nutch - User mailing list archive at Nabble.com.

Re: [ANNOUNCEMENT] Lewis John Mc Gibbney is a Nutch committer and PMC member

Posted by Lewis John Mcgibbney <le...@gmail.com>.
We also maintain a number of plugins that you may wish to use to store
richer representational metadata about your webpage content. Check these
out.

hth

On Tue, Mar 27, 2012 at 11:02 AM, remi tassing <ta...@gmail.com> wrote:

> Try this:
>
> http://wiki.apache.org/solr/FAQ#My_search_returns_too_many_.2BAC8_too_little_.2BAC8_unexpected_results.2C_how_to_debug.3F
>
> Solr also has a debug mode where you can see each result's score, etc.

-- 
*Lewis*

Re: [ANNOUNCEMENT] Lewis John Mc Gibbney is a Nutch committer and PMC member

Posted by remi tassing <ta...@gmail.com>.
Try this:
http://wiki.apache.org/solr/FAQ#My_search_returns_too_many_.2BAC8_too_little_.2BAC8_unexpected_results.2C_how_to_debug.3F

Solr also has a debug mode where you can see each result's score, etc.
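
As a rough illustration of the debug mode mentioned above, the sketch below
builds a select URL with debugQuery=on, which asks Solr to include the parsed
query and per-document score explanations in the response. The base URL
(http://localhost:8983/solr/select) and the helper name build_debug_url are
assumptions based on the default Solr example setup, not details from this
thread; adjust them to your installation.

```python
from urllib.parse import urlencode

# Assumed base URL of the Solr select handler (default example setup);
# change it to match your own installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_debug_url(query, rows=10, base_url=SOLR_SELECT_URL):
    """Return a select URL that asks Solr for debug output (hypothetical helper)."""
    params = {
        "q": query,          # the user's search terms
        "rows": rows,        # number of documents to return
        "fl": "*,score",     # return stored fields plus the relevance score
        "debugQuery": "on",  # include parsed query and score explanations
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Open the printed URL in a browser and compare the parsed query
    # in the debug section with the terms you actually typed.
    print(build_debug_url("nutch crawler"))
```

In the debug section of the response, the "parsedquery" entry shows how Solr
interpreted the query. If it looks nothing like the terms you typed, the
default search field or the query parser settings may be worth checking.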
